Closed CunliangGeng closed 1 year ago
@CunliangGeng the strain mapping file is normally provided manually by the user - it is the key information that links the genomics data to the metabolomics information. Only when the data is downloaded from the PoDP are these connections loaded into NPLinker automatically. Of course, this step in the process is tricky, as the user may not get the format of the mapping file completely right. Do you have any suggestions on how to improve this step and make it less "error-prone" and thus more "robust"?
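One way to make a hand-written mapping file less error-prone is to check it before it is loaded. The sketch below is only an illustration: it assumes a CSV layout in which each row starts with a strain label followed by one or more aliases (for example genome or spectrum file names). That layout, the function name `validate_strain_mappings`, and the example names are assumptions, not NPLinker's confirmed format or API.

```python
import csv
import io


def validate_strain_mappings(text):
    """Return a list of human-readable problems found in the mapping text.

    Assumed layout (hypothetical): one strain per row, the first cell is the
    strain label and the remaining cells are aliases for that strain.
    """
    problems = []
    seen = {}  # name -> line number where it first appeared
    for lineno, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        names = [cell.strip() for cell in row]
        if not any(names):
            continue  # skip blank lines
        if not names[0]:
            problems.append(f"line {lineno}: missing strain label")
        if len([n for n in names if n]) < 2:
            problems.append(f"line {lineno}: strain has no aliases")
        for name in names:
            if not name:
                continue
            if name in seen and seen[name] != lineno:
                # The same name on two rows would link one file to two strains.
                problems.append(
                    f"line {lineno}: '{name}' already used on line {seen[name]}"
                )
            else:
                seen.setdefault(name, lineno)
    return problems


if __name__ == "__main__":
    example = "strainA,genome_1\nstrainB,genome_1\nstrainC\n"
    for problem in validate_strain_mappings(example):
        print(problem)
```

Reporting every problem with its line number, instead of failing on the first one, lets the user fix the whole file in one pass.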
@justinjjvanderhooft This step is indeed a pain point. I think we could take the following measures to improve it:
Thanks for the suggestions. I agree that PoDP is a great entry point, but in practice many users will start from local files - and may already have BiG-SCAPE results and/or Molecular Networking runs. The GUI tool sounds like a great suggestion - how much work would that be? It might be a nice goal for an intern.
It took experienced engineers more than half a year in total to develop cffinit (see the dev history plot). So I guess the GUI tool would require a similar amount of effort. I think it's a very good internship project.
Wow, that is quite an effort indeed. Something to consider - if an intern is interested, please do encourage them to take up this challenge - at least we could make a start with it... We could re-use bits and pieces of the PoDP add form, as in one of its steps we basically create the mapping file from previously generated information and direct links to the publicly available metabolomics data files...
This tool should not run in a browser, as the browser will restrict the tool from detecting files on the user's machine. So I don't think we could reuse the PoDP code (a web app running in the browser). The tool would be better as a desktop application with a graphical user interface.