nfdi4plants / ARCtrl

Library for management of Annotated Research Contexts (ARCs) using an in-memory representation and runtime-agnostic contract systems.
MIT License

[Feature Request] Add "derivesFrom" key in studies.materials.samples. #412

Open xiaoranzhou opened 2 months ago

xiaoranzhou commented 2 months ago

**Describe the bug**
There is no "derivesFrom" key in studies.materials.samples. When the same ARC is used to generate ARC JSON, ARCcommander's output has derivesFrom:

        "name": "CAM_01",
        "factorValues": [...],
        "derivesFrom": [
            {
                "name": "DB_097",
                "characteristics": [
                    {
                        "category": {},
                        "value": {
                            "annotationValue": "Arabidopsis thaliana",
                            "termSource": "",
                            "termAccession": ""
                        }
                    }
                ]
            }
        ]
    }

But the new ISA-JSON export from ARCtrl does not have "derivesFrom".

Here is the ISA specification.

**To Reproduce**
Steps to reproduce the behavior:

  1. Use the following JavaScript script to create an ISA-JSON export:

    import {Xlsx} from "@fslab/fsspreadsheet";
    import * as ARC from '@nfdi4plants/arctrl';
    import fs from "fs";
    import path from "path";

    // Write

    // Convert Windows-style backslashes to forward slashes.
    export function normalizePathSeparators (str) {
      const normalizedPath = path.normalize(str)
      return normalizedPath.replace(/\\/g, '/');
    }

    export async function fulfillWriteContract (basePath, contract) {
      // Create the parent directory of filePath if it does not exist yet.
      function ensureDirectory (filePath) {
        let dirPath = path.dirname(filePath)
        if (!fs.existsSync(dirPath)){
          fs.mkdirSync(dirPath, { recursive: true });
        }
      }
      const p = path.join(basePath,contract.Path)
      if (contract.Operation == "CREATE") {
        if (contract.DTO == undefined) {
          ensureDirectory(p)
          fs.writeFileSync(p, "")
        } else if (contract.DTOType == "ISA_Assay" || contract.DTOType == "ISA_Study" || contract.DTOType == "ISA_Investigation") {
          ensureDirectory(p)
          await Xlsx.toFile(p, contract.DTO)
        } else if (contract.DTOType == "PlainText") {
          ensureDirectory(p)
          fs.writeFileSync(p, contract.DTO)
        } else {
          console.log("Warning: The given contract is not a correct ARC write contract: ", contract)
        }
      }
    }

    // Read

    export async function fulfillReadContract (basePath, contract) {
      async function fulfill() {
        const normalizedPath = normalizePathSeparators(path.join(basePath, contract.Path))
        switch (contract.DTOType) {
          case "ISA_Assay":
          case "ISA_Study":
          case "ISA_Investigation":
            let fswb = await Xlsx.fromXlsxFile(normalizedPath)
            return fswb
          case "PlainText":
            // Read synchronously so that the text content itself is returned.
            let content = fs.readFileSync(normalizedPath, "utf8")
            return content
          default:
            console.log(`Handling of ${contract.DTOType} in a READ contract is not yet implemented`)
        }
      }
      if (contract.Operation == "READ") {
        return await fulfill()
      } else {
        console.error(`Error (fulfillReadContract): "${contract}" is not a READ contract`)
      }
    }

    export function getAllFilePaths(basePath) {
      const filesList = []
      function loop (dir) {
        const files = fs.readdirSync(dir);
        for (const file of files) {
          const filePath = path.join(dir, file);

          if (fs.statSync(filePath).isDirectory()) {
            // If it's a directory, recursively call the function on that directory
            loop(filePath);
          } else {
            // If it's a file, calculate the relative path and add it to the list
            const relativePath = path.relative(basePath, filePath);
            const normalizePath = normalizePathSeparators(relativePath)
            filesList.push(normalizePath);
          }
        }
      }
      loop(basePath)
      return filesList;
    }

    // put it all together
    async function read(basePath) {
      let allFilePaths = getAllFilePaths(basePath)
      // Initiates an ARC from FileSystem but no ISA info.
      let arc = ARC.ARC.fromFilePaths(allFilePaths)
      // Read contracts will tell us what we need to read from disc.
      let readContracts = arc.GetReadContracts()
      console.log(readContracts)
      let fcontracts = await Promise.all(
        readContracts.map(async (contract) => {
          let content = await fulfillReadContract(basePath, contract)
          contract.DTO = content
          return (contract)
        })
      )
      arc.SetISAFromContracts(fcontracts);
      console.log(fcontracts);
      return arc
    }

    // execution

    await read("Facultative-CAM-in-Talinum").then(
      arc => {
        try {
          fs.writeFileSync('isa-export.json', ARC.JsonController.Investigation.toISAJsonString(arc.ISA, void 0, true))
          // file written successfully
        } catch (err) {
          console.error(err);
        }
      }
    )


  2. Open the export and search the text for "derivesFrom"; it cannot be found.
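
The manual search in step 2 can also be scripted. The following is only a minimal sketch, assuming the export uses the layout shown above (studies, then materials, then samples). The helper name `findSamplesWithoutDerivesFrom` and the inline example object are hypothetical and made up for illustration; "CAM_02" is not taken from the real ARC.

```javascript
// Hypothetical helper: return the names of all samples that have no
// non-empty "derivesFrom" array.
// Assumes the layout investigation.studies[].materials.samples[].
function findSamplesWithoutDerivesFrom(investigation) {
  const missing = [];
  for (const study of investigation.studies ?? []) {
    for (const sample of study.materials?.samples ?? []) {
      const sources = sample.derivesFrom;
      if (!Array.isArray(sources) || sources.length === 0) {
        missing.push(sample.name);
      }
    }
  }
  return missing;
}

// Inline example data (illustrative only, not a real export).
const example = {
  studies: [
    {
      materials: {
        samples: [
          { name: "CAM_01", derivesFrom: [{ name: "DB_097" }] },
          { name: "CAM_02" }
        ]
      }
    }
  ]
};

console.log(findSamplesWithoutDerivesFrom(example)); // [ 'CAM_02' ]
```

To run it against the real file, the export could be parsed first, for example with `JSON.parse(fs.readFileSync('isa-export.json', 'utf8'))`, and passed to the helper in place of `example`.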

**Expected behavior**
The export should have a "derivesFrom" key that links each sample to its source.

HLWeil commented 1 month ago

Hey Xiaoran, thanks for the issue.

Technically, the isa json we create follows the specification set by the isa-json schemas. Therefore I changed the name from "Bug" to "Feature Request".

But I see the need to include more features used by other ISA-related tools (as with #392).

xiaoranzhou commented 1 month ago

Hi Lukas, thanks for the fast and informative reply. Yes, it is more of a feature request. I have also found that some of the ids differ from those in the ISA JSON generated by ARCcommander. Previously, the ISA JSON generated by ARCcommander only had names such as “CAM_01” and “CAM_01ext”. In the new ISA-JSON, prefixes like “sample” and “source_” were added to CAM_01: "#Sample_CAM", "#Source_CAM_01" and "#Sample_CAM_01_ext" exist, and "#Source_CAM_01" exists only as an id in one input. Could you please explain their usage?
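
To compare the identifiers in the two exports, it may help to list every id that occurs in a file. This is a rough sketch, assuming identifiers are stored under the "@id" key as in ISA-JSON; the helper name `collectIds` and the example object are hypothetical and only illustrate the idea.

```javascript
// Hypothetical helper: recursively collect every string stored under "@id".
// A Set is used so that an id referenced several times is listed once.
function collectIds(node, ids = new Set()) {
  if (Array.isArray(node)) {
    for (const item of node) collectIds(item, ids);
  } else if (node !== null && typeof node === "object") {
    if (typeof node["@id"] === "string") ids.add(node["@id"]);
    for (const value of Object.values(node)) collectIds(value, ids);
  }
  return ids;
}

// Illustrative fragment only; the structure is assumed, not copied from a real export.
const exampleExport = {
  processSequence: [
    {
      inputs: [{ "@id": "#Source_CAM_01" }],
      outputs: [{ "@id": "#Sample_CAM_01_ext" }]
    }
  ]
};

// Print the sorted list of distinct ids found in the fragment.
console.log([...collectIds(exampleExport)].sort());
```

Running this on both the ARCcommander export and the ARCtrl export and comparing the two lists would show which prefixed ids appear in only one of them.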