Open jacobwindsor opened 8 years ago
What are the most commonly used file formats for this kind of thing? Is it CSV?
Asking around tells me that tab-delimited files (as text files) or comma-delimited files (Excel) are being used. I think it would be nice if users could click on the type of file they have and let the program then run for that type of file specifically. That way the program is faster (I hope) and usable by biologists who do not have any programming experience. See the picture below for an example of my idea ;) The first "button" accepts CSV, the second accepts tab-delimited files, and the third explains how other data files should be changed in order to use the ranker program.
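A rough sketch of how the per-type "buttons" could map onto the parsing step, assuming Python and its standard `csv` module. The names `DELIMITERS` and `read_rows` are made up for illustration and are not part of the ranker program:

```python
import csv
import io

# Hypothetical mapping from the button the user clicks to a delimiter.
DELIMITERS = {"csv": ",", "tab": "\t"}


def read_rows(handle, file_type):
    """Return every row of an open text file as a list of strings."""
    try:
        delimiter = DELIMITERS[file_type]
    except KeyError:
        # The third "button" would explain how to convert such files.
        raise ValueError(f"unsupported file type: {file_type!r}")
    return list(csv.reader(handle, delimiter=delimiter))


# In-memory stand-ins for uploaded files.
csv_rows = read_rows(io.StringIO("50-00-0,formaldehyde\n64-17-5,ethanol\n"), "csv")
tab_rows = read_rows(io.StringIO("50-00-0\tformaldehyde\n"), "tab")
```

Because the user states the file type up front, the program does not have to guess the delimiter, which keeps the code path for each type short.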
Nice idea. Should be fairly simple to do
Okay nice ;)
I also found somewhere that ISA-Tab (http://isa-tools.org/) is the most common file format for metabolomics data.
http://regexr.com/ for advanced users. I'm going to ask (regular) biologists and chemists what they would like the RP to do, display etc.
Should we still only use one metabolites dataset, or can we (and do we want to) include other kinds of datasets (proteomics, (environmental) chemistry, toxicology)? And do we want to use another dataset to validate the RP?
There are many file types that are commonly used for large sets of metabolites or other compounds. BioPAX, Octave, SciLab, XML are just a few. It would be great to support all of these formats.
Moreover, the formatting of the dataset can vary greatly, and the programme currently only allows for the CAS ID followed by the IUPAC name in brackets. A customisable regex string passed as a method parameter would be a much better way to harvest the data from the source file.
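A minimal sketch of the customisable regex idea, assuming Python. The function name `harvest`, the `pattern` parameter, and the named groups `cas` and `name` are illustrative assumptions, not the programme's actual interface. The default pattern is a guess at the current behaviour (CAS ID followed by the name in round brackets):

```python
import re

# Assumed default: CAS ID, then the name in round brackets,
# e.g. "64-17-5 (ethanol)".
DEFAULT_PATTERN = r"(?P<cas>\d{2,7}-\d{2}-\d)\s*\((?P<name>.+)\)"


def harvest(lines, pattern=DEFAULT_PATTERN):
    """Extract (cas, name) pairs from lines using a user-supplied regex.

    The pattern must define the named groups `cas` and `name`;
    lines that do not match are skipped.
    """
    regex = re.compile(pattern)
    records = []
    for line in lines:
        match = regex.search(line)
        if match:
            records.append((match.group("cas"), match.group("name")))
    return records


# Default layout, including a name that itself contains brackets.
default_records = harvest([
    "64-17-5 (ethanol)",
    "50-78-2 (2-(acetyloxy)benzoic acid)",
    "not a compound line",
])

# A differently formatted source file: name first, then a semicolon.
custom_records = harvest(
    ["ethanol;64-17-5"],
    pattern=r"(?P<name>[^;]+);(?P<cas>\d{2,7}-\d{2}-\d)",
)
```

Using named groups means the order of the CAS ID and the name in the source file does not matter, so users only need to change the pattern and not the code.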