griquelme / tidyms

TidyMS: Tools for working with MS data in untargeted metabolomics
BSD 3-Clause "New" or "Revised" License

Example wrong? #2

Closed sorenwacker closed 3 months ago

sorenwacker commented 3 years ago

Hi, the library looks very promising although I have some starting issues. I was following the quickstart guide.

https://tidyms.readthedocs.io/en/latest/quickstart.html

import os
# creates a list with the path to each data file to analyze
path = "data"
file_list = [os.path.join(path, x) for x in os.listdir(path)]
roi, feature_data = ms.detect_features(data_path, separation="uplc",
                                        instrument="qtof")

It seems like data_path actually has to be file_list. The errors could also be a bit more informative. When passed as in the above example, it returns "File is a directory" for some reason, at least with my custom input. Maybe you could add a type check and return a more informative message. It would also be nice if pathlib objects were supported:


from pathlib import Path
Path('/mydatadir/')

sorenwacker commented 3 years ago

Also, the feature detection seems to run on a single processor, file by file. Have you thought about parallelizing that process?
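To make the suggestion concrete, here is a minimal sketch of per-file parallelism using only the standard library. `process_files` is a hypothetical helper, not part of tidyms, and `func` stands in for whatever per-file work the library does. The executor class is a parameter so the same code can be tried with threads as well as processes.

```python
from concurrent.futures import ProcessPoolExecutor


def process_files(func, file_list, max_workers=None,
                  executor_cls=ProcessPoolExecutor):
    """Apply func to every path in file_list, one task per file.

    Hypothetical helper for illustration only, not a tidyms function.
    Results are returned in the same order as file_list.
    """
    with executor_cls(max_workers=max_workers) as executor:
        # executor.map yields results in input order, not completion order
        return list(executor.map(func, file_list))
```

With the default process pool, `func` has to be picklable (for example a module-level function), and on platforms that start workers by spawning, the calling script should run under an `if __name__ == "__main__":` guard.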

griquelme commented 3 years ago

Hello, sorry for the late reply. Yeah you are totally right, it should be file_list. I'll fix the documentation as soon as I can.

Regarding the input for the detect_features function, the docstring specifies that the input should be a list of strings that represent the absolute paths to the raw files. However, type checking would be a useful addition. Also, I am currently working on creating a new class to manage feature detection and correspondence, so I'll try to include your suggestions of using pathlib and parallelization in this new implementation.
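As a rough illustration of the type check and pathlib support discussed above, the input could be normalized before any file is opened. `normalize_file_list` is a hypothetical name and is not part of tidyms. The sketch assumes the function should keep receiving a list of absolute path strings, as the docstring describes.

```python
import os
from pathlib import Path


def normalize_file_list(paths):
    """Return a list of absolute path strings, one per data file.

    Hypothetical helper for illustration only, not a tidyms function.
    Accepts an iterable of str or pathlib.Path objects.
    """
    # A single str or Path is most likely the data directory passed by
    # mistake, so reject it with an explicit message.
    if isinstance(paths, (str, os.PathLike)):
        raise TypeError(
            "expected a list of file paths, got a single path: "
            f"{os.fspath(paths)!r}"
        )
    normalized = []
    for p in paths:
        p = Path(p)
        if p.is_dir():
            raise IsADirectoryError(
                f"expected a data file, got a directory: {p}"
            )
        if not p.is_file():
            raise FileNotFoundError(f"data file not found: {p}")
        # resolve() produces the absolute path the docstring asks for
        normalized.append(str(p.resolve()))
    return normalized
```

Passing the directory itself, as in the quickstart snippet, would then fail immediately with a message that names the offending path instead of a bare "File is a directory".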