Open sp-eldata opened 6 years ago
@sp-eldata Agreed. The "databases" included by default are unlikely to be useful.
@chubukov Ideas around what we could do? Compile something more extensive or link to public databases?
Best would be to grab the latest version of HMDB or KEGG. But both have a lot of errors. We have a mildly curated version we could contribute, but it's still not perfect. You could also just compile some recent public metabolomics datasets and just take formula and HMDB or KEGG id for the metabolites detected.
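For illustration, the last suggestion (compile a few public datasets and keep only formula plus an HMDB or KEGG id) might look roughly like the sketch below. The column names (`name`, `formula`, `hmdb_id`, `kegg_id`) and the `compile_compound_list` helper are assumptions made up for this example, not anything Maven defines, and the sample rows are only there to show the merge.

```python
import csv
import io


def compile_compound_list(tables):
    """Merge per-dataset metabolite tables into one de-duplicated list.

    `tables` is an iterable of file-like objects containing CSV text with
    the (assumed) columns: name, formula, hmdb_id, kegg_id.
    """
    compounds = {}
    for handle in tables:
        for row in csv.DictReader(handle):
            # Prefer the HMDB id as the key and fall back to the KEGG id.
            key = (row.get("hmdb_id") or row.get("kegg_id") or "").strip()
            formula = (row.get("formula") or "").strip()
            if not key or not formula:
                continue  # skip rows that lack an id or a formula
            # The first dataset to mention a compound wins.
            compounds.setdefault(
                key,
                {"id": key, "name": (row.get("name") or "").strip(), "formula": formula},
            )
    return sorted(compounds.values(), key=lambda c: c["id"])


# Illustrative input only; real datasets would be read from files.
dataset_a = "\n".join([
    "name,formula,hmdb_id,kegg_id",
    "glucose,C6H12O6,HMDB0000122,C00031",
    "lactate,C3H6O3,HMDB0000190,C00186",
])
dataset_b = "\n".join([
    "name,formula,hmdb_id,kegg_id",
    "D-glucose,C6H12O6,HMDB0000122,C00031",
    "pyruvate,C3H4O3,,C00022",
    "unknown,,,",
])

merged = compile_compound_list([io.StringIO(dataset_a), io.StringIO(dataset_b)])
for compound in merged:
    print(compound["id"], compound["formula"], compound["name"])
```

Keying on the id rather than the name avoids duplicates when two datasets spell the same metabolite differently, but it does nothing about errors in the ids themselves, so some curation would still be needed.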
@chubukov Could we start with the one that Agios has? That will be a significant improvement if not perfect.
@Raghavdata @sahil21 Do we have anything internally? Another user had requested a DB for MS2.
I think if you're going to include ms/ms databases, it would be good to make a big effort to make sure the fragmentation widget and all the other tools that would interact with that database are actually working properly.
I'll try to get you what we have.
V.
@sp-eldata We use a 1700-compound DB (metabolites only) and a 2700-compound DB (metabolites plus a few other small molecules such as drugs) internally. Both have been curated from KEGG and are specifically for MS1. We don't have anything like this for MS2.
I guess the "knowns" table is actually pretty close to my second suggestion (list from a typical publication). I would take off the retention times though (no reason to think they'd match anyone's method).
@Raghavdata that sounds like a good list.
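A minimal sketch of taking the retention times off a compound table before shipping it, as suggested above for the "knowns" table. The column name `rt` and the sample row are assumptions for illustration; the actual header in the knowns file may differ.

```python
import csv
import io


def drop_columns(csv_text, unwanted):
    """Return `csv_text` with the named columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in reader.fieldnames if name not in unwanted]
    out = io.StringIO()
    # extrasaction="ignore" lets us pass full rows and write only `kept`.
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Hypothetical input: the rt value is a placeholder, not a measured time.
knowns = "compound,formula,id,rt\nglucose,C6H12O6,C00031,7.2\n"
stripped = drop_columns(knowns, {"rt"})
print(stripped)
```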
@Raghavdata Let's ship these out for MS?
@chubukov How do we ship out a good MS/MS database? I don't think we are using anything internally. Are any publicly available databases worth looking at? I know about METLIN.
@sp-eldata I don't think it makes sense to ship an "ms/ms database" that only has a single precursor and product m/z (which is what the maven source files have). I think most people expect such a database to have a full fragmentation pattern. That's why I was asking if we even really support the related features in Maven.
If you do go in that direction, there is some public stuff on MassBank. HMDB also has spectra. NIST and Metlin are good but not free. I'm sure there are many others -- I'm actually not an expert on this.
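To make the distinction concrete, here is a rough sketch of the two record shapes being contrasted: a single precursor/product pair versus a full fragmentation pattern. The class and field names are hypothetical and the m/z and intensity values are placeholders, not real spectra or Maven's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    """A single precursor/product m/z pair."""
    precursor_mz: float
    product_mz: float


@dataclass
class Spectrum:
    """A full fragmentation pattern: many (m/z, intensity) peaks."""
    precursor_mz: float
    peaks: list = field(default_factory=list)  # list of (mz, intensity) tuples

    def base_peak(self):
        # The most intense fragment in the spectrum.
        return max(self.peaks, key=lambda peak: peak[1])

    def to_transitions(self):
        # Flattening to transitions discards the intensity information.
        return [Transition(self.precursor_mz, mz) for mz, _ in self.peaks]


# Placeholder numbers chosen only to show the shape of the data.
spectrum = Spectrum(
    precursor_mz=200.0,
    peaks=[(50.0, 10.0), (120.0, 100.0), (182.0, 35.0)],
)
print(spectrum.base_peak())
print(len(spectrum.to_transitions()))
```

A library of `Spectrum`-style records can always be reduced to transitions, but not the other way round, which is one reason a transition-only file is a weak starting point for an MS/MS database.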
@sahil21 2717_Compounds_DB (1).csv.zip
From a user at UC Denver: