qcxms / QCxMS

Quantum mechanic mass spectrometry calculation program
https://xtb-docs.readthedocs.io/en/latest/qcxms_doc/qcxms.html
GNU Lesser General Public License v3.0
42 stars 22 forks source link

No chemical information in qcxms.out #79

Open anani-a-missinou opened 1 year ago

anani-a-missinou commented 1 year ago

Hi dear developer,

Basic information

Our laboratory works on the specified metabolites involved in the ecological interactions of crops in the Brassicaceae family (Within this group, we can find crops, e.g., broccoli, cabbage, cauliflower, kale, Brussels sprouts, turnip, or rapeseed.

We are interested in your quantum-chemical electron ionization mass spectra QCxMS program to generate CID/EI spectra of known plant small molecules for which authentic references are almost absent.

My operating system: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz 2.21 GHz, 16.0 GB (15.8 GB usable)

What happened?

The CREST protonation tool works very well and creates protonated file. First, we wanted to run EI. The first execution of the qcxms program generated two files qcxms.gs and trjM. Afterward, the second execution of qcxms program generated xtb.last, qcxms.Mspec.xtb2, qcxms.Mspec.tbxtb files and created a TMPQCXMS folder which contains several subfolders containing qcxms.in, qcxms.start, and start.xyz.

However, when running the command "pqcxms -j 12 -t 6 &", the qcxms.out file below was created and presumably does not contain chemical information. qcxms.out.txt

Thank you

JayTheDog commented 1 year ago

Hello, that sounds very interesting, thanks for sharing! Well, the problem is written right in the output:

 changed by input:
># EMPTY INPUT FILE                                                              
Mon Oct  2 22:58:22 CEST 2023
ERROR STOP  -- Unrecognized Keyword -- 

Normally there would be an automatic creation of a qcxms.in file, which contains keywords as described here. I guess for a first approach you could fix this by creating a file called qcxms.in where you just put the following:

ei

If you want more control, you could add new lines there as well, e.g. increasing the number of trajectories, electronic impact energy, etc. (see the link above on how the corresponding commands are called)

@gorges97 : can you check why the input file is not created automatically?

gorges97 commented 1 year ago

Hello, this is a strange error and unfortunately I cannot reproduce it. The error message "ERROR STOP -- Unrecognized Keyword -- " has to come from an unrecognized keyword given via the command line. The pqcxms script executes the command "qcxms --prod" in every TMP.XX directory by writing and executing the wrapped_qcxms script. Can you check what is written in the wrapped_qcxms script that is written after the pqcxms script is executed? And you could try to to execute "qcxms --prod" in one of the TMP directories to see if the error is still happening. If not, the problem is in the bash script pqcxms.

anani-a-missinou commented 1 year ago

Hi dear @JayTheDog,

Sorry for the silence and late, I just got back from vacation.

I previously executed the "qcxms" command without providing any "qcxms.in" file because as described your description https://xtb-docs.readthedocs.io/en/latest/qcxms_doc/qcxms_run.html# input-keywords-in-qcxms-in assumed that EI calculation would be conducted.

I just now re-executed the "qcxms" command with "ei" as a parameter in "qcxms.in" file. This takes several hours of running.

thank you for help

anani-a-missinou commented 1 year ago

Hi dear @gorges97,

Sorry for the silence and late, I just got back from vacation.

I ran the "qcxms --prod" command in one of the TMPQCXMS subfolders in which the following files had been generated (qcxms.out, qcxms.in, qcxms.start, start.xyz). After running the script the following ones were generated (1.1.xyz, 1.2.xyz, 1.3.xyz, ion.out, neutral.out, xtb.last, MDtrj.98.0.1, and qcxms.res), but I notice that the qcxms.res file is empty.

$ls -l -rwxrwxrwx 1 amissinou amissinou 1192 Oct 9 00:09 1.1.xyz -rwxrwxrwx 1 amissinou amissinou 262 Oct 9 00:09 1.2.xyz -rwxrwxrwx 1 amissinou amissinou 944 Oct 9 00:09 1.3.xyz -rwxrwxrwx 1 amissinou amissinou 799788 Oct 9 00:09 MDtrj.98.0.1 -rwxrwxrwx 1 amissinou amissinou 1183 Oct 9 00:10 ion.out -rwxrwxrwx 1 amissinou amissinou 1123 Oct 9 00:10 neutral.out -rwxrwxrwx 1 amissinou amissinou 2 Oct 8 20:16 qcxms.in -rwxrwxrwx 1 amissinou amissinou 2255 Oct 8 20:19 qcxms.out -rwxrwxrwx 1 amissinou amissinou 0 Oct 8 21:22 qcxms.res -rwxrwxrwx 1 amissinou amissinou 3433 Oct 8 20:16 qcxms.start -rwxrwxrwx 1 amissinou amissinou 2968 Oct 8 20:16 start.xyz -rwxrwxrwx 1 amissinou amissinou 1243 Oct 9 00:09 xtb.last

Thank you for help

gorges97 commented 1 year ago

Could you please send me the file qcxms.out? The generated files seem to be OK, but the fact that the qcxms.res file is empty is really strange.

anani-a-missinou commented 1 year ago

For me is same, no chemical information in qcxms.out

qcxms.out.txt

anani-a-missinou commented 1 year ago

Hi dear @JayTheDog,

The running of gcxms with the command 'ei' in the file "qcxms.in" (as you recommended) has just finished. Attached are the tracks of the execution (qcxms_tracks.txt). The results are identical to those issued with no file "qcxms.in" file. I add qcxms.out, wrapped_qcxms attached as txt files.

wrapped_qcxms.txt qcxms.out.txt qcxms_tacks.txt

NB: I noticed that the wrapped_qcxms is created temporarily when creating qcxms.out, then is automatically deleted afterwards. Is this normal?

Is there a step after the protonation step before running qcxms?

I am sending you an example of our data in order to help to determine the origin of bug. You will find in zip file, the struc.xyz for example. and the deprotonated.xyz, protonated.xyz, qcxms.in, coord file (obtained from crest tool).

files_in.zip

Thank you for your help

gorges97 commented 1 year ago

Hello dear @anani-a-missinou,

It seems that you are using an old pqxms script which creates a wrapped_qcxms file with the command "qcxms -prod", but it has to be "qcxms --prod" with two dashes! Please download the latest pqcxms script from the 5.2.1 release package and use it for your calculation, this should fix your problem.

It is correct that the wrapped_qcxms is deleted after running pqcxms.

If you want to calculate a spectrum for a protonated structure, you need to take a structure (usually the lowest in energy) from the protonated.xyz file and copy its coordinates into the input geometry file. QCxMS does not read the protonated.xyz file from CREST.

anani-a-missinou commented 1 year ago

Hi dear @gorges97

I installed version 5.2.1, but it killed while running. This is track.

########## (base) amissinou@REN-1349-A104:/mnt/d/Parts/DATA/Metabolomics/Database/BraChemBD/chemical_fragmentation/crest-enso-qcxms/m321.107/results$ qcxms Fri Oct 13 13:02:54 CEST 2023

                  *********************************************
                  *                                           *
                  *            Q   C   x   M   S              *
                  *                                           *
                  *                  V5.2.1                   *
                  *        Sep 20 18:00:00 CEST 2022          *
                  *                                           *
                  *                S. Grimme                  *
                  *                J. Koopman                 *
                  * Mulliken Center for Theoretical Chemistry *
                  *             Universitaet Bonn             *
                  *                                           *
                  *********************************************

QCxMS is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

QCxMS is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

Cite this work as: S.Grimme, Angew.Chem.Int.Ed. 52 (2013) 6306-6312.

for the CID module: J. Koopman, S. Grimme, J. Am. Soc. Mass Spectrom., (2021), DOI: 10.1021/jasms.1c00098

with respect to negative/multiple charges: J. Koopman, S. Grimme, ChemRxiv, (2022), DOI: 10.26434/chemrxiv-2022-w5260

for the GFN1-xTB implementation: V. Asgeirsson, C.Bauer, S. Grimme, Chem. Sci. 8 (2017) 4879

for the GFN2-xTB implementation: J. Koopman, S. Grimme, ACS Omega 4 (12) (2019) 15120-15133, DOI: 10.1021/acsomega.9b02011

changed by input:

ei # Mass Spec. Method: ei (default) cid, dea

          ************************************************************
                   Mode:          Electron Impact (EI)
          ************************************************************

----- Internal program parameters -----

QC Program : xTB QC Level : GFN2-xTB Dispersion : D4 MO spectrum with : xTB

M+ Ion charge(charge) : 1 total traj. (ntraj) : 950 time steps (tstep) : 0.50 fs sim. time / MD (tmax) : 5.00 ps Initial temp. (tinit) : 500.00 K

------------ EI settings ----------- E(e-impact) (eimp0) : 70.00 eV Iee/atom (ieeatm) : 0.60 eV/atom relax. time (trelax) : 2000.00 fs


qc path /usr/local/bin/

xtbhome directory ~/.XTBPARAM/

Generating uniform velocity distribution, T= 500.0 Groundstate eTemp set to 298.149993896484

initializing GFN2-xTB ... initialization successful! energy = -70.50456 charge = 0 mult = 1 etemp = 298.1 M trajactory, equilibration ... 23750

         E N T E R I N G   M D   M O D U L E

step time [fs] Epot Ekin Etot error #F eTemp frag. T 500 250. -70.42058 0.0783 -70.34224 0.0000 1 298. 480. 1000 500. -70.42332 0.0835 -70.33980 0.0000 1 298. 519. 1500 750. -70.41432 0.0756 -70.33868 0.0000 1 298. 498. 2000 1000. -70.43101 0.0916 -70.33943 0.0000 1 298. 486. 2500 1250. -70.41348 0.0727 -70.34080 0.0000 1 298. 480. 3000 1500. -70.38825 0.0859 -70.30237 0.0000 1 298. 475. 3500 1750. -70.39934 0.1113 -70.28803 0.0000 1 298. 493. 4000 2000. -70.40031 0.1129 -70.28738 0.0000 1 298. 505. 4500 2250. -70.37775 0.0898 -70.28793 0.0000 1 298. 515. 5000 2500. -70.37615 0.0874 -70.28877 0.0000 1 298. 522. 5500 2750. -70.41564 0.0911 -70.32449 0.0000 1 298. 524. 6000 3000. -70.41068 0.0859 -70.32475 0.0000 1 298. 521. 6500 3250. -70.40912 0.0831 -70.32601 0.0000 1 298. 518. 7000 3500. -70.40036 0.0744 -70.32599 0.0000 1 298. 515. 7500 3750. -70.41690 0.0930 -70.32392 0.0000 1 298. 513. 8000 4000. -70.41627 0.0909 -70.32542 0.0000 1 298. 511. 8500 4250. -70.41617 0.0895 -70.32670 0.0000 1 298. 510. 9000 4500. -70.39978 0.0798 -70.31997 0.0000 1 298. 509. 9500 4750. -70.41270 0.0894 -70.32327 0.0000 1 298. 507. 10000 5000. -70.42769 0.1015 -70.32623 0.0000 1 298. 507. 10500 5250. -70.40209 0.0751 -70.32702 0.0000 1 298. 505. 11000 5500. -70.41695 0.0950 -70.32200 0.0000 1 298. 504. 11500 5750. -70.39788 0.0738 -70.32410 0.0000 1 298. 503. 12000 6000. -70.41963 0.0949 -70.32477 0.0000 1 298. 503. 12500 6250. -70.40288 0.0792 -70.32367 0.0000 1 298. 502. 13000 6500. -70.40974 0.0802 -70.32952 0.0000 1 298. 502. 13500 6750. -70.40088 0.0761 -70.32473 0.0000 1 298. 502. 14000 7000. -70.41357 0.0873 -70.32626 0.0000 1 298. 501. 14500 7250. -70.41777 0.0902 -70.32752 0.0000 1 298. 500. 15000 7500. -70.41690 0.0923 -70.32461 0.0000 1 298. 500. 15500 7750. -70.40554 0.0823 -70.32320 0.0000 1 298. 499. 16000 8000. -70.42759 0.1013 -70.32625 0.0000 1 298. 500. 16500 8250. -70.41316 0.0869 -70.32623 0.0000 1 298. 499. 17000 8500. -70.42560 0.0954 -70.33024 0.0000 1 298. 499. 17500 8750. -70.42019 0.0955 -70.32467 0.0000 1 298. 499. 18000 9000. -70.42567 0.1028 -70.32291 0.0000 1 298. 499. 18500 9250. -70.40393 0.0783 -70.32562 0.0000 1 298. 498. 19000 9500. -70.42418 0.0980 -70.32615 0.0000 1 298. 498. 19500 9750. -70.40771 0.0867 -70.32099 0.0000 1 298. 498. 20000 10000. -70.40700 0.0789 -70.32806 0.0000 1 298. 498. 20500 10250. -70.40355 0.0762 -70.32737 0.0000 1 298. 498. 21000 10500. -70.40617 0.0812 -70.32497 0.0000 1 298. 497. 21500 10750. -70.41766 0.0920 -70.32562 0.0000 1 298. 497. 22000 11000. -70.41699 0.0922 -70.32475 0.0000 1 298. 497. Killed

Thanks!

gorges97 commented 1 year ago

I just meant that you have to use a newer pqcxms script. You can also leave everything else the same and just replace the "-prod" in line 92 with "--prod" in your current pqcxms script. Of course it's also a problem that the newest version crashes for you, I'll have a look at that too.

anani-a-missinou commented 1 year ago

Hi dear @gorges97

The ligne 92 of pqcxms already contain --prod. I ran pqcxms v5.2.1 on files previously generated by qcxms v5.2.0 (conda version) because when I try to run qcxms v5.2.1, it does not generate qcxms.Mspec.xtb2 and qcxms.Mspec.tbxtb files and xtb.last file generated only contain only this information ""[Info] Starting GFN2-xTB calculation in tblite""", suggesting steps for equilibration is not ended. I hope the script of qcxms is the same between the 5.2.0 and 5.2.1 versions

Now pqcxms gives two files tmpqcxms.res and tmpqcxms.out, added as *.txt files. Awesome! tmpqcxms.res.txt tmpqcxms.out.txt

Now I'm trying to compare of calculated spectrum to the experimental spectrum (formatted as intensity in the first column and m/z in the second column). I try the command plotms -e m321.107.csv. The format of CSV is correct? m321.107.csv

He only outputted three files result.csv, results.jdx, mass.agr. Is it possible to export in image format as PNG or TIFF, ...?

Thanks for your help

gorges97 commented 1 year ago

Great! Yes, the only change for the pqcxms script is the change of the "--prod"/"-prod" command, so using qcxms v5.2.0 is fine. Thank you for that info regarding the v5.2.1 in conda, something seems to go wrong there.

Your m321.107.csv file is not in the right data format for plotms. It has to be like the result.csv file: m/z, intensity m/z, intensity ...., .... This is misleadingly stated in the documentation, I have updated it. The mass.agr file can be opened with xmgrace and exported to png from that program.

anani-a-missinou commented 1 year ago

Hi dear @gorges97,

Great, everything is fine. Thank so much !

Is it possible to directly plot the matching score on the comparison graph or at least in the mass.agr file?

I got a score of 0.400, which I think, corresponds to 400 in your article. What are your recommended thresholds for judging unreliable similarities? and tips for improving fragmentation?

In addition, Is there a threshold number of atoms or m/z for which you do not recommend calculating the fragmentation? because some Brassica molecules have m/z up to 1000 Da, as flavonoids are sometimes multiacylated and multi-glycosylated.

Knowing that our experimental spectra come from a Bruker Daltonics UHR-QqTOF Impact II spectrometer, or UPLC-Orbitrap-Q-exactive or Orbitrap Exploris 240 for which the fragmentation experiments were performed using ESI+ and ESI- with argon as a collision gas for collision energy from 20 eV to 40 eV. Can these bellow parameters be correct? and what are your recommendations for amelioration and what are the critical parameters?

For CID (counterpart of ESI+)

cid xtb2 charge 1 elab 40 lchamb 0.25 fullauto

for DEA (which I think is a counterpart of ESI-?)

ei PBE ma-def2-svp charge -1 upper 15 lower 0

Thank you in advance for all of these precisions!

gorges97 commented 1 year ago

Adding the score directly to the mass.agr with plotms is not possible, but you could add the score manually in the mass.agr file. For example in the line "@ string def "M\S+" and add the score where the M+ is printed in the plot.

Yes, this is right, the maximum score printed is 1.0 which corresponds to 1000 in the publications. Usually, a score of 600-700 can be considered to be a really good agreement but the score is very dependent on the peaks with higher m/z values and thus on the relative height of the molecular peak, which can be changed by increasing or decreasing the impact excess energy (keyword ieeatm ). However, I would look more at whether the relevant peaks are predicted and only use the score as a rough guide.

Regarding the system size, I would not go way beyond 100 atoms (including hydrogen atoms), as the computations are getting very expensive for large molecules.

The energy settings of your CID input look good, but I would first try to find good energy settings with fewer trajectories (e.g., "ntraj 100"), as we observed that the correct energy settings can be very system-dependent, especially in CID.

DEA is not the counterpart of ESI-, but of EI- (open shell ionization). The input for ESI- would be: cid xtb2 charge -1 elab 40 lchamb 0.25 fullauto

For your system size, I would recommend using xtb2 first instead of DFT because it is very expensive. But since negatively charged systems are generally more difficult for QM methods, it may be necessary to use DFT.