Closed JianAtSeer closed 2 years ago
Hi, it looks as if the mzML import is not integrated correctly. For the mzML files, the spectrum title is not defined. Could you upload and share one of these files so that we can ensure compatibility?
Hi, thanks for the quick reply. Actually, after playing around with it a little, I think the issue could be somewhere else. I was able to run the pipeline to completion with a single file, but it fails with multiple files, specifically during the feature finding step.
In interface.py, the following code section seems to raise an error when the file is not .raw or .d. I am not sure if this is intended. Is feature finding not supported for mzML files? The pipeline seems to run fine when I input just a single mzML file, though.
if step.__name__ == 'find_features':
    base, ext = os.path.splitext(files[0])
    if ext.lower() == '.d':
        memory_available = psutil.virtual_memory().available/1024**3
        n_processes = max((int(memory_available //25 ),1))
        logging.info(f'Using Bruker Feature Finder. Setting Process limit to {n_processes}.')
    elif ext.lower() == '.raw':
        memory_available = psutil.virtual_memory().available/1024**3
        n_processes = max((int(memory_available //8 ), 1))
        logging.info(f'Setting Process limit to {n_processes}')
    else:
        raise NotImplementedError('File extension {} not understood.'.format(ext))
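For illustration, here is a minimal sketch of how the extension check could also accept mzML input. It is not the actual AlphaPept code or the eventual fix: the helper name `process_limit`, the lookup table, and the 8 GB per process budget for `.mzml` are assumptions made for this example. Only the `.d` and `.raw` values come from the snippet above.

```python
import os

# GB of free memory budgeted per worker process.
# The '.d' and '.raw' values mirror the quoted snippet;
# the '.mzml' value is an assumption for illustration only.
GB_PER_PROCESS = {'.d': 25, '.raw': 8, '.mzml': 8}


def process_limit(file_path, memory_available_gb):
    """Return a worker process count for feature finding (hypothetical helper)."""
    _, ext = os.path.splitext(file_path)
    try:
        gb_per_process = GB_PER_PROCESS[ext.lower()]
    except KeyError:
        # Same error as the original branch for unknown extensions.
        raise NotImplementedError(f'File extension {ext} not understood.') from None
    # Always allow at least one process, even on low-memory machines.
    return max(int(memory_available_gb // gb_per_process), 1)
```

Passing the available memory as an argument (instead of calling `psutil` inside the function) keeps the sketch easy to test; in the real pipeline the value would presumably come from `psutil.virtual_memory().available/1024**3`, as in the snippet above.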
Good catch. I made some fixes in the https://github.com/MannLabs/alphapept/tree/qc_fixes branch. Do you mind testing if this solves the problem? This will then be included in the next release.
Thanks. I just tested the fixes and they work! On a side note, I was wondering if you could help me with a different issue. While testing the branch, I noticed that feature finding takes a really long time (more than 1 day) for some of my mzML files, while my other files finish in less than 20 minutes. They are all from similar samples. Could you help me figure out why this set of files is special and takes so long? Here is the file: https://seer.box.com/s/yofs6w3vy3twodsbiigf2d1yj8z1kime Thanks a lot.
Hi, this sounds like a bug. Thanks for sharing the file, I will investigate.
Hi, just checking to see if you have any clue about the reason for the long-running feature detection. Also, do you want me to open this as a separate issue, since it is not exactly related to the issue I originally opened this ticket about? Thanks.
Hi, yes, I could reproduce the bug. It is probably some runtime condition, and I will probably have time to investigate and fix this tomorrow. Good idea to open another issue, then we can reference this properly.
Bug was related to having zero intensities in the mzML, should now be fixed. Feel free to check out the develop branch, otherwise it will be included in the next release.
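As a rough illustration of the kind of input that triggered the problem, the sketch below drops zero-intensity peaks from a spectrum before any further processing. This is not the actual fix on the develop branch, whose details are not described here. The helper name `drop_zero_intensity` and the plain-list representation of a spectrum are assumptions made for this example.

```python
def drop_zero_intensity(mz_values, intensities):
    """Keep only peaks with strictly positive intensity (hypothetical helper)."""
    # Pair each m/z value with its intensity and discard zero-intensity peaks.
    kept = [(mz, inten) for mz, inten in zip(mz_values, intensities) if inten > 0]
    mz_kept = [mz for mz, _ in kept]
    intensity_kept = [inten for _, inten in kept]
    return mz_kept, intensity_kept
```

For example, calling `drop_zero_intensity([100.0, 200.0, 300.0], [5.0, 0.0, 2.5])` keeps the first and third peaks only.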
Great! Thanks. I just confirmed that the fix works.
Describe the bug: I tried to run the pipeline on two mzML files, but I got the error: An exception occured running AlphaPept version 0.3.28: File extension .mzML not understood.
I read in the documentation that the pipeline relies on pyteomics to read and parse mzML files, so I tried loading the mzML file with pyteomics separately, and it seems to be fine. The log shows:
File extension .mzML not understood.
Expected behavior: The pipeline finishes all the steps.