Thanks for trying out the analysis.
That code was developed specifically over raw LCMS traces exported from Waters' instruments. In those cases the output was three files, 01.nc, 02.nc, and 03.nc, of which the first was the real sample data, hence the check for 01 in the file name.
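For concreteness, the check being referred to is presumably something along these lines (a minimal sketch only; the function and argument names are made up and the actual code in the repo will differ):

```python
import os

def pick_sample_trace(nc_files):
    """Return the trace that looks like the real sample data.

    Waters exports in this setup produced 01.nc, 02.nc, and 03.nc, and only
    the file whose name ends in "01.nc" held the actual sample, so that is
    the one the pipeline keeps. Names here are illustrative assumptions.
    """
    for path in nc_files:
        if os.path.basename(path).endswith("01.nc"):
            return path
    raise ValueError("no trace ending in 01.nc found among %s" % (nc_files,))
```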
I cannot make any claims about how the code will perform on your sample data, but please try.
I have pushed https://github.com/20n/act/commit/368170097e57b0d71e123c10903f2494e869587f, which changes that assertion to a warning. The changes have been merged into master, so please pull.
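In case it helps to see the shape of that change, turning a hard assertion into a warning in Python looks roughly like this (a sketch of the idea, not the literal diff from that commit; the function name and exact condition are assumptions):

```python
import warnings

def check_trace_name(filename):
    """Warn, rather than assert, when a trace does not follow the 01.nc convention."""
    # Previously this would have been a hard assertion, e.g.:
    #   assert filename.endswith("01.nc"), "expected the 01 sample trace"
    # Now the pipeline warns and keeps going, so differently named files can still be tried.
    if not filename.endswith("01.nc"):
        warnings.warn("%s does not end in 01.nc; it may not be the sample trace" % filename)

check_trace_name("ko15.CDF")  # emits a warning instead of crashing
```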
Best of luck.
Thanks for the quick response. It will be some time before I can try the improved version, but I will report back when I do. In the meantime, if any more documentation or descriptions of this technique become available, let me know and I will check it out. The data set I want to try is the "standard" data that has been provided with xcms for nearly a decade, so I think it could be a useful way to compare your cool new ML approach with a tried-and-true, but aging, set of algorithms.
Indeed. Side-by-side comparisons are good.
We collected 2400+ LCMS traces over our engineered organisms, and exclusively used this untargeted metabolomics pipeline to analyze the supernatants and pellets. We consistently got the right calls on both our pathway products and discovered side products. To evaluate against XCMS, we did run a short project comparing the outputs, and found this pipeline more robust.
Admittedly, we were not XCMS experts, so we could not hand-tune its parameters to perfection, and it appears you might have the experience to do it properly. If you'd like, I can give you sample LCMS files from our runs, wild-type vs. engineered, including information on the pathway products this code detects. You can then try to make XCMS detect the same.
For the moment, I am going to close this issue. If you want to experiment with our data, please open an appropriate issue later and I'll be happy to assist.
Thanks for publishing this code! It looks very interesting and I wanted to try it.
I installed TensorFlow, Python, and all the other dependencies with conda on Mac OS X, cloned the repo, and was able to run
python bucketed_differential_deep.py -h
successfully, so I assume TensorFlow, Keras, etc. are functioning. I moved the faahKO data, which is in netCDF format, to a lcms_data directory I made in the act/reachables/src/main/python/DeepLearningLcmsPeak directory, and then tried running the script on those files. The result is an error about the proper ending of file names that I don't completely understand.
What is the right format for the filenames I want to supply to this code?
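Until the expected format is confirmed, one workaround might be to copy the faahKO CDF files (which are netCDF) to names that end in 01.nc, matching the Waters-style convention described above. A sketch, where the lcms_data directory and the 01.nc ending come from this thread and everything else is an assumption:

```python
import glob
import os
import shutil

# Hypothetical prep step: copy each faahKO CDF file to a name ending in
# "01.nc" so the pipeline's filename check treats it as the sample trace.
src_dir = "lcms_data"
for cdf in glob.glob(os.path.join(src_dir, "*.CDF")):
    stem = os.path.splitext(os.path.basename(cdf))[0]
    dst = os.path.join(src_dir, "%s_01.nc" % stem)  # e.g. ko15 -> ko15_01.nc
    shutil.copyfile(cdf, dst)
```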