Closed ghost closed 4 years ago
Hi! Thank you for reporting this issue to us.
The underlying problem is an obo version mismatch triggered by the mzML file. You will probably see an 'xxx.obo file is missing' error when you run LipidHunter_debug.exe.
Please download the obo.zip patch for LipidHunter2.
For source code version users: download obo.zip, then copy and replace all .obo files in the obo subfolder under /site-packages/pymzml/obo/
For Windows users: download obo.zip, then copy and replace all .obo files in the obo subfolder under the LipidHunter2 software folder.
This issue is already fixed in our source code version. If you can run the source code version, please switch to the latest Master branch, committed on Jan 22, 2020.
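For source code installs, the copy-and-replace step can also be scripted. The following is only a minimal sketch: the function name `replace_obo_files` and the folder names in the usage note are assumptions for illustration, not part of LipidHunter or pymzml.

```python
import shutil
from pathlib import Path


def replace_obo_files(patch_dir, target_dir):
    """Copy every .obo file from patch_dir into target_dir, overwriting
    files of the same name. Returns the sorted list of copied file names."""
    patch_dir, target_dir = Path(patch_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for obo in sorted(patch_dir.glob("*.obo")):
        # copy2 keeps file metadata and silently replaces an existing file
        shutil.copy2(obo, target_dir / obo.name)
        copied.append(obo.name)
    return copied
```

Assuming obo.zip was extracted to a local folder named `obo`, the target for a source install could be located with `Path(pymzml.__file__).parent / "obo"` after `import pymzml`, and the call would then be `replace_obo_files("obo", target)`. It may be worth backing up the original obo folder first.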
Please give us feedback on whether the issue is fixed, thanks!
Please note that the previous LipidHunter release uses pymzml 0.7.8 and PySide. The current Master branch, committed on Jan 22, 2020, uses pymzml 2.4.0+ and PySide2.
For more details about this obo issue, see our pull request to pymzml: Solution for obo file related errors #134 https://github.com/pymzml/pymzML/pull/134
Please use Python 3.7 for the latest LipidHunter and check whether our sample dataset runs. You can download the sample dataset from: https://github.com/SysMedOs/lipidhunter/releases/download/LipidHunter2_RC/TestData.zip
I am using Python 3.7.4 with the latest branch of LipidHunter and have pymzml version 2.4.6 installed. I also copied and replaced the obo files in the pymzml site-packages with the obo files you supplied. I ran LipidHunter again, this time with 4 cores and 16 GB of dedicated RAM, and let it run for 7 hours on my file, and it still did not produce any result files. I have used pymzml before as part of pyqms and haven't had any problems there. I have also used the OpenMS mzML input architecture on the same file and haven't faced any problems. At some point in the past, I also forked and made minor changes to the SpectraReader.py file that comes with LipidHunter, and it runs with no issues but still takes quite a long time.
That's strange to me. How long does it take you to run the sample dataset from the link above? Could you kindly provide the mzML file so I can have a look? Thanks! Please also tell me which instrument you are using and how you convert the raw file to mzML, including the conversion parameters if you can. I will try to solve this issue as soon as possible.
I just ran the test files and the program now runs to completion. I ran the G_Pos_Thermo_Orbi.mzML file and hunted for Triacylglycerols [M+H]+. The program could not identify any lipids but managed to run and finished in 101.546 s. Could you please tell me the settings that you use when converting Thermo .RAW files to mzML files, presumably with MSConvert? I will then try the same settings, run my file again, and inform you of the result.
We are now using ProteoWizard 3.0.20027 and above. Please have a look at the following screenshot.
The sample dataset should be used for TG with [M+NH4]+. You can have a closer look at the parameters for this file in our user guide: https://github.com/SysMedOs/lipidhunter/releases/download/LipidHunter2_RC/LipidHunter_UserGuide.pdf
For mzML files converted from Thermo files, you should start from 1 min and use an MS1 threshold of 5000 and an MS2 threshold of 100 as defaults. You can try setting them to 10000 and 1000 to make it faster. It would be great if you could send us some screenshots from the terminal so that I can see where most of the time is spent. Generally, LipidHunter reads mzML files at a speed of about 2 min of RT per 1 min of processing time. We have never seen an identification take longer than 45 min for PLs or 90 min for TGs. I hope you get identifications faster after converting the files again.
Here is a link to a file that I am using : https://syncandshare.lrz.de/getlink/fiBUsRK8CVwVmZNZG8tjrxr4/PC.mzML
The above file was downloaded from MetaboLights as a Thermo .RAW file and converted using the parameters you showed above. The sample was enriched for phosphatidylcholine and phosphatidylethanolamine. I am currently comparing software and have managed to obtain results (identified lipids) from pipelines using LipidFinder, Lipyd and ALEX123. I selected Phosphatidylcholine [M+HCOO]- as the target lipid class and the program hasn't finished running after 2 hours.
Hi! I've looked at your mzML. There is something very strange in the file: there are very few peaks in the MS2 spectra, e.g. for m/z 804.5 there are fewer than 4 peaks in the mzML file you shared with me. Please have a look at this screenshot I got from ProteoWizard:
The peaks are low in number and intensity. The ppm errors of the FA [M-H]- fragments are quite large.

| FA | Theo. m/z | Obs. m/z | ppm |
|---|---|---|---|
| FA16:0 [M-H]- | 255.2324 | 255.20 | -126 |
| FA18:1 [M-H]- | 281.2481 | 281.07 | -633 |
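The ppm values in this table follow from the usual definition, (observed - theoretical) / theoretical × 10^6. As a small sketch (the function name `ppm_error` is only illustrative, and the observed FA16:0 value is assumed to be 255.20):

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million relative to the theoretical m/z."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Fragment values taken from the table above
fragments = [
    ("FA16:0 [M-H]-", 255.2324, 255.20),
    ("FA18:1 [M-H]-", 281.2481, 281.07),
]
for name, theo, obs in fragments:
    print(f"{name}: {ppm_error(obs, theo):.0f} ppm")
```

This gives roughly -127 ppm and -633 ppm, which agrees with the table to within rounding and shows why an MS2 tolerance of 50 ppm cannot match these fragments.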
LipidHunter is not yet optimized for this kind of MS2 data, since we usually identify lipids from high-resolution LC-MS (MS2 ppm < 100).
Here is an example of PC [M+HCOO]- identified by LipidHunter from some other dataset.
Please send us the raw file if possible. If this is a conversion issue, we can find a working version of ProteoWizard for your data. If it is a spectra quality issue, I will try to see if we can tune LipidHunter to work in this ppm range.
It looks like you selected something wrong in the Threshold settings in the conversion, since the Data points column is always 10 in the mzML you just shared.
Please choose Absolute intensity in the interface, as shown in the screenshot below.
Please check that all parameters are exactly the same as in the screenshot above.
Your raw file should give an mzML file of at least 100 MB.
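The "Data points is always 10" symptom can also be checked without a viewer. The sketch below uses only the Python standard library and relies on two parts of the mzML schema: the `defaultArrayLength` attribute of each `spectrum` element and the `ms level` cvParam (accession MS:1000511). The function name `ms2_point_counts` is an assumption for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def _local(tag):
    """Strip the XML namespace from an element tag."""
    return tag.rsplit("}", 1)[-1]


def ms2_point_counts(mzml_source):
    """Histogram of data points per MS2 spectrum: {n_points: n_spectra}.
    mzml_source may be a file path or a binary file object."""
    counts = Counter()
    for _, elem in ET.iterparse(mzml_source, events=("end",)):
        if _local(elem.tag) != "spectrum":
            continue
        ms_level = None
        for child in elem.iter():
            # MS:1000511 is the PSI-MS accession for "ms level"
            if _local(child.tag) == "cvParam" and child.get("accession") == "MS:1000511":
                ms_level = child.get("value")
                break
        if ms_level == "2":
            counts[int(elem.get("defaultArrayLength", 0))] += 1
        elem.clear()  # free memory, mzML files can be large
    return counts
```

If nearly all MS2 spectra fall into a single small bin such as 10, the conversion filter has most likely truncated the spectra to a fixed number of peaks.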
I have not done any experimental analysis myself and am not attached to any lab, so all the files I am using were downloaded from the MetaboLights database. I only downloaded files from studies that used an ESI-nanoLC Thermo Orbitrap Fusion in their analysis pipeline. I retried the conversion using the parameters you mentioned on an LC-MS-only lipidomics Thermo RAW file, and LipidHunter did not run to completion. Below are links to the .RAW and .mzML files for both the LC-MS/MS and LC-MS runs for you to examine.
https://syncandshare.lrz.de/getlink/fiKPTSupcsEA2cWFkRfQQER3/LCMS-OF-Neg.mzML
https://syncandshare.lrz.de/getlink/fi7ZvdT9wjPqk2oa51njBKzT/LCMS-OF-Neg.raw
https://syncandshare.lrz.de/getlink/fiBUsRK8CVwVmZNZG8tjrxr4/PC.mzML
https://syncandshare.lrz.de/getlink/fi4UGqcJoCPVxqCu5iXJygK3/PC.raw
This is the output from the console.
>>> Hunter started ... Please wait ...
Parameters used are as following
[parameters]
vendor = thermo
experiment_mode = LC-MS
lipid_class = PC
charge_mode = [M+HCOO]-
fawhitelist_path_str = C:\Program Files (x86)\LipidHunter\ConfigurationFiles\1-FA_Whitelist.xlsx
score_cfg = C:\Program Files (x86)\LipidHunter\ConfigurationFiles\2-Score_weight_PL.xlsx
mzml_path_str = C:\Users\kunda\Documents\Computational-Lipidomics\RawFiles\LCMS-OF-Neg.mzML
img_output_folder_str = C:\Users\kunda\Documents\Computational-Lipidomics\RawFiles\LCMSOFLipidHunterOutput
xlsx_output_path_str = C:\Users\kunda\Documents\Computational-Lipidomics\RawFiles\LCMSOFLipidHunterOutput\LCMSOF.xlsx
rt_start = 1.0
rt_end = 25.0
mz_start = 500.0
mz_end = 1000.0
dda_top = 6
pr_window = 0.75
ms_th = 5000
ms_ppm = 20
ms2_th = 100
ms2_ppm = 50
ms2_infopeak_threshold = 0.001
rank_score_filter = 40.0
score_filter = 40.0
isotope_score_filter = 80.0
lipid_specific_cfg = C:\Program Files (x86)\LipidHunter\ConfigurationFiles\3-Specific_ions.xlsx
core_number = 3
max_ram = 5
img_type = png
img_dpi = 300
hunter_folder = C:\Program Files (x86)\LipidHunter
hunter_start_time = 2020-02-19_10-53-33
rank_score = True
tag_all_sn = True
fast_isotope = False
ms_max = 0
Hi,
I just downloaded the raw file, converted it to mzML myself, and ran LipidHunter. It took me around 15 min from converting the file to obtaining results like those below.
The main reason you did not get results is that the conversion to mzML was not correct. This is the screenshot of MSConvert when I convert this file: Please check all fields marked in the screenshot. The file size after conversion should be above 8 MB per minute of run time for a Thermo file (this file is 270 MB in the screenshot).
Once you have converted the mzML correctly, you can use the SeeMS tool from ProteoWizard to have a look. It should be similar to the screenshot below: If the mzML file is fine, I think you will have no problem running LipidHunter.
The MS2 spectra in this file were acquired in the LIT (linear ion trap), so you have to use a higher MS2 ppm, e.g. 900. LipidHunter can still work with this MS2 resolution. See the settings I used: There is some mass shift at the MS1 level when you check some typical PC lipids, so I set MS1 ppm to 100.
I would also recommend checking the MS2 ppm range you used in your previous data analysis. The correct MS1 and MS2 ppm settings can give you better results.
The full settings are:
[parameters]
vendor = thermo
experiment_mode = LC-MS
lipid_class = PC
charge_mode = [M+HCOO]-
fawhitelist_path_str = /home/ni/sysmedos/lipidhunter/ConfigurationFiles/1-FA_Whitelist.xlsx
score_cfg = /home/ni/sysmedos/lipidhunter/ConfigurationFiles/2-Score_weight_PL.xlsx
mzml_path_str = /home/ni/Documents/KSachi/PC.mzML
img_output_folder_str = /home/ni/Documents/KSachi/Results/PC
xlsx_output_path_str = /home/ni/Documents/KSachi/Results/PC_test.xlsx
rt_start = 3.0
rt_end = 25.0
mz_start = 600.0
mz_end = 1000.0
dda_top = 6
pr_window = 0.85
ms_th = 1000
ms_ppm = 100
ms2_th = 10
ms2_ppm = 900
ms2_infopeak_threshold = 0.001
rank_score_filter = 40.0
score_filter = 40.0
isotope_score_filter = 80.0
lipid_specific_cfg = /home/ni/sysmedos/lipidhunter/ConfigurationFiles/3-Specific_ions.xlsx
core_number = 3
max_ram = 5
img_type = png
img_dpi = 300
hunter_folder = /home/ni/sysmedos/lipidhunter
hunter_start_time = 2020-02-19_14-04-52
rank_score = True
tag_all_sn = True
fast_isotope = False
ms_max = 0
Please find the complete output in this zip package: PC_test_KSachi.zip
Based on these preliminary results, you can optimize the parameters and run again, e.g. set the following parameters to get a faster run and better result quality.
rt_start = 5.0
rt_end = 15.0
mz_start = 700.0
mz_end = 900.0
dda_top = 6
pr_window = 0.85
ms_th = 1000
ms_ppm = 80
ms2_th = 10
ms2_ppm = 900
rank_score_filter = 50.0
score_filter = 50.0
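The suggested values above can be merged into a saved copy of the full settings programmatically. This is a sketch under the assumption that the `[parameters]` block is stored as a plain INI-style text file that Python's `configparser` can read; the helper name `apply_overrides` and the file name in the usage note are hypothetical.

```python
import configparser

# Values copied from the suggestion above
SUGGESTED = {
    "rt_start": 5.0, "rt_end": 15.0,
    "mz_start": 700.0, "mz_end": 900.0,
    "dda_top": 6, "pr_window": 0.85,
    "ms_th": 1000, "ms_ppm": 80,
    "ms2_th": 10, "ms2_ppm": 900,
    "rank_score_filter": 50.0, "score_filter": 50.0,
}


def apply_overrides(cfg_text, overrides, section="parameters"):
    """Parse INI-style settings text and overwrite the given keys."""
    # interpolation=None keeps '%' literal; '=' is the only delimiter so that
    # Windows paths such as 'C:\...' are not split at the colon
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(cfg_text)
    for key, value in overrides.items():
        parser.set(section, key, str(value))
    return parser
```

The result can be written back with `parser.write(fh)` on a file opened for writing, for example to a hypothetical `PC_optimized.txt`. Keys not listed in `SUGGESTED` keep their original values.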
You can also change ConfigurationFiles/1-FA_Whitelist.xlsx to add more fatty acids for phospholipids.
Hope this time you can get LipidHunter running.
Hi!
I had a look at the mzML files you converted; they are fine. The PC.mzML gives exactly the same results as I posted above. However, something is not correct with the files LCMS-OF-Neg.raw and LCMS-OF-Neg.mzML.
LCMS-OF-Neg.raw was acquired in positive mode; see the screenshot below showing a typical positive mode spectrum with m/z 184:
LCMS-OF-Neg.mzML also says it is a positive mode spectrum:
Judging from the precursor list in this file, NO TG with the [M+NH4]+ adduct was selected for MS2. Currently, LipidHunter identifies phospholipids in negative mode only. Thus, I recommend that you skip this file for LipidHunter and manually check the identification results from other software, to confirm that the polarity and fragmentation pattern of the identified TGs or phospholipids are correct. We always recommend manually reviewing at least 5 to 10 lipids from the software reports; this will give you a better idea of the identification quality and more solid results.
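The polarity recorded in an mzML file can be checked before running LipidHunter. The sketch below uses only the Python standard library and the PSI-MS accessions MS:1000129 (negative scan) and MS:1000130 (positive scan). It assumes the polarity cvParam is written inside each `spectrum` element, which may not hold for files that store it in a referenceable parameter group; the function name `scan_polarities` is illustrative.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# PSI-MS controlled vocabulary accessions for scan polarity
POLARITY = {"MS:1000129": "negative", "MS:1000130": "positive"}


def scan_polarities(mzml_source):
    """Count spectra per polarity: {"positive": n, "negative": m}.
    mzml_source may be a file path or a binary file object."""
    counts = Counter()
    for _, elem in ET.iterparse(mzml_source, events=("end",)):
        if elem.tag.rsplit("}", 1)[-1] != "spectrum":
            continue
        for child in elem.iter():
            label = POLARITY.get(child.get("accession"))
            if label:
                counts[label] += 1
                break
        elem.clear()  # free memory, mzML files can be large
    return counts
```

A file named as negative mode that reports only positive scans, as described above, would show up here as a count under "positive" and nothing under "negative".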
Wish you all the best for your analysis.
I have been trying to use LipidHunter but have not been able to get any results in the 5 times I have tried the software. Below are the parameters that I used in my hunt for lipids:
vendor = thermo
experiment_mode = LC-MS
lipid_class = PC
charge_mode = [M+HCOO]-
fawhitelist_path_str = C:\Program Files (x86)\LipidHunter\ConfigurationFiles\1-FA_Whitelist.xlsx
score_cfg = C:\Program Files (x86)\LipidHunter\ConfigurationFiles\2-Score_weight_PL.xlsx
mzml_path_str = C:\Users\kunda\Documents\Computational-Lipidomics\RawFiles\Experiment2\PC.mzML
img_output_folder_str = C:\Users\kunda\Documents\Computational-Lipidomics\RawFiles\Experiment2\LipidHunterOutput
xlsx_output_path_str = C:\Users\kunda\Documents\Computational-Lipidomics\RawFiles\Experiment2\LipidHunterOutput\LipidHunterPCOutput.xlsx
rt_start = 0.0
rt_end = 10.0
mz_start = 500.0
mz_end = 1000.0
dda_top = 6
pr_window = 0.75
ms_th = 1000
ms_ppm = 19
ms2_th = 10
ms2_ppm = 49
ms2_infopeak_threshold = 0.001
rank_score_filter = 40.0
score_filter = 40.0
isotope_score_filter = 80.0
lipid_specific_cfg = C:\Program Files (x86)\LipidHunter\ConfigurationFiles\3-Specific_ions.xlsx
core_number = 3
max_ram = 5
img_type = png
img_dpi = 300
hunter_folder = C:\Program Files (x86)\LipidHunter
hunter_start_time = 2020-02-11_14-24-09
rank_score = True
tag_all_sn = True
fast_isotope = False
ms_max = 0
I have tried different iterations of these parameters. I have all the dependencies installed and have managed to use them all separately without any problems (e.g. pymzml). Any ideas on how I can run the program to completion and produce an output xlsx file with identified lipid classes?