Closed jmbadia closed 3 years ago
Hi @jmbadia (sorry for the late reply), yes, it would be great if you could add this information and make a pull request.
thanks!
great !
Coming back to this issue... I cannot map the PRECURSOR TYPE and SPECTRUM TYPE fields directly to adduct and msLevel. The terms are only equivalent for MSn spectra. The best option here is to keep the original field names and document that the user can rename those columns to adduct and msLevel, respectively. What do you think, @jorainer?
Sounds good to me. I did something similar for parsing/accessing the MassBank database. The precursor m/z is not always numeric in MassBank; sometimes it is a character string or has multiple values. I'm thus storing the precursor m/z in a field called "precursor_mz_text" as it is retrieved from MassBank (as a text string) and, in addition, converting it to a numeric with as.numeric and using that as precursorMz. It will be correct for those entries that have a numeric precursor m/z but will be NA for all others (in which case the user can still get the original information from the $precursor_mz_text spectra variable).
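A minimal sketch of the approach described above, assuming the raw precursor m/z values have already been read as character strings. The example values and the `spectra_data` data frame are made up for illustration; only the names `precursor_mz_text` and `precursorMz` come from the description above.

```r
# Hypothetical raw values: one plain numeric, one with multiple values,
# one free-text entry.
precursor_mz_text <- c("183.0121", "183.0121/185.0092", "not available")

# as.numeric() yields NA (with a warning) for anything that is not a single
# number; the warning is suppressed because NA is the intended result here.
precursorMz <- suppressWarnings(as.numeric(precursor_mz_text))

# Keep both columns so the original text stays available when the
# conversion fails.
spectra_data <- data.frame(
    precursor_mz_text = precursor_mz_text,
    precursorMz = precursorMz,
    stringsAsFactors = FALSE
)
```

Entries that fail the conversion end up with `NA` in `precursorMz`, while `precursor_mz_text` still holds the original string.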
Perfect. I'll replicate your MassBank solution :)
link to pull request #81 and issue #80
MoNa provides the `adduct` (field `PRECURSOR TYPE`) and `msLevel` (field `SPECTRUM TYPE`) for MS2 spectra (issue #30). I think it would be a good idea to add these fields to `.extract_spectra_mona_sdf()`. I'll do it if that is OK with you.

Please also note that `smiles` and `splash` don't have a dedicated field in the sdf file, but they appear regularly (>99.88% of the negative MS/MS sdf file) in the `COMMENT` field (along with other variables, separated by a "__" character). It seems that the algorithm that converts the data to the sdf format deliberately puts all the available variables in the `COMMENT` field. Maybe it would be a good idea to parse `smiles` and `splash`.

Sample of a `COMMENT` field: "SMILES=c1c(cc(c(c1Cl)n2c(c(c(n2)C#N)S(=O)C(F)(F)F)N)Cl)C(F)(F)F cas=120068-37-3 chebi=83394 kegg=C11099 pubchem cid=3352 chemspider=3235 InChI=InChI=1S/C12H4Cl2F6N4OS/c13-5-1-4(11(15,16)17)2-6(14)8(5)24-10(22)9(7(3-21)23-24)26(25)12(18,19)20/h1-2H,22H2 __ computed SMILE ....."
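A rough sketch of how such a value could be pulled out of the `COMMENT` string. The helper `extract_comment_value()` is hypothetical and not part of the package. It assumes that each entry has the form `key=value`, that the value contains no whitespace, and that the first occurrence of the key is the one wanted; the real sdf layout may differ.

```r
# Hypothetical helper: return the value following "key=" in each comment
# string, or NA if the key is not present.
extract_comment_value <- function(comment, key) {
    # Assumption: the value runs up to the next whitespace character.
    pattern <- paste0("\\b", key, "=(\\S+)")
    matches <- regmatches(comment, regexec(pattern, comment, perl = TRUE))
    vapply(matches, function(m) {
        if (length(m) >= 2) m[2] else NA_character_
    }, character(1))
}

# Shortened version of the sample COMMENT field shown above.
comment <- paste(
    "SMILES=c1c(cc(c(c1Cl)n2c(c(c(n2)C#N)S(=O)C(F)(F)F)N)Cl)C(F)(F)F",
    "cas=120068-37-3 chebi=83394"
)

smiles <- extract_comment_value(comment, "SMILES")
cas <- extract_comment_value(comment, "cas")
missing <- extract_comment_value(comment, "splash")  # not in this sample
```

Because the match stops at whitespace, keys whose values contain spaces would need a different rule.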