Open sneumann opened 5 years ago
Current thoughts after talking with @adelenelai: option (1) is likely the easiest. I can count at least 6 formats (and probably more) that we'll need to work with, so writing a new reader for each (option 2) seems impractical, and ideally we'd like to merge libraries where possible rather than have thousands of them. With option (1) we could write small converters from every format to the MetFrag mb format and keep the efficient internal format that MetFrag needs, then offer users the opportunity to download the resulting library files, which they can then specify using OfflineSpectralDatabaseFile=... @MaliRemorker @he-ob @rickhelmus
The class de.ipbhalle.metfraglib.scoreinitialisation.OfflineMetFusionSpectralSimilarityScoreInitialiser at https://github.com/ipb-halle/MetFragRelaunched/blob/c57f9d2b406350b2357ce9f7ce42a286cefcca13/MetFragLib/src/main/java/de/ipbhalle/metfraglib/scoreinitialisation/OfflineMetFusionSpectralSimilarityScoreInitialiser.java#L26 is used to initialize the parameters for the MetFusion-like score, which includes reading the spectral file MoNA-export-LC-MS.mb. If nothing else is given in the settings with OfflineSpectralDatabaseFile = ..., this class uses the file located at https://github.com/ipb-halle/MetFragRelaunched/blob/master/MetFragLib/src/main/resources/MoNA-export-LC-MS.mb
As already mentioned, this file is in a non-standard format that includes only the little information needed by the score. The class de.ipbhalle.metfraglib.peaklistreader.MultipleTandemMassPeakListReader in https://github.com/ipb-halle/MetFragRelaunched/blob/c57f9d2b406350b2357ce9f7ce42a286cefcca13/MetFragLib/src/main/java/de/ipbhalle/metfraglib/peaklistreader/MultipleTandemMassPeakListReader.java is used to read this file. It creates a de.ipbhalle.metfraglib.collection.SpectralPeakListCollection, which is stored in the global MetFrag settings object. The score class de.ipbhalle.metfraglib.score.OfflineMetFusionSpectralSimilarityScore later uses this data to calculate the MetFusion-like score for each candidate.

There are now two possibilities. First, you simply create a new spectral file in the format I used. It's quite simple as it only needs the parameters:
SampleName,InChI,InChIKey,IsPositiveIonMode,PrecursorIonMode,MassError,MSLevel,IonizedPrecursorMass,NumPeaks,MolecularFingerPrint
followed by the spectral data. The layout is easy to figure out by looking at the default file. The new file can then be used by defining its path with
OfflineSpectralDatabaseFile = ...
The fingerprint function used is the MACCSFingerprint included in the CDK implementation.
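To make the first possibility more concrete, here is a minimal Java sketch of what a small converter could emit for one spectrum: the ten parameters listed above, followed by the peaks. The class name MbRecordSketch, the method formatRecord, the `# Field = value` header layout and the `mz intensity` peak lines are all assumptions for illustration, not taken from MetFrag; the real layout should be checked against the default MoNA-export-LC-MS.mb file. The sample values in main are placeholders, and the fingerprint string in particular would have to come from the CDK MACCS fingerprint of the molecule.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of MetFragLib.
class MbRecordSketch {

    // Header fields in the order given in this thread.
    static final List<String> FIELDS = List.of(
            "SampleName", "InChI", "InChIKey", "IsPositiveIonMode",
            "PrecursorIonMode", "MassError", "MSLevel",
            "IonizedPrecursorMass", "NumPeaks", "MolecularFingerPrint");

    /**
     * Formats one record. Each peak is a two-element array {mz, intensity}.
     * NumPeaks is derived from the peak list, so it need not be in meta.
     * ASSUMPTION: "# Field = value" lines followed by "mz intensity" lines.
     */
    static String formatRecord(Map<String, String> meta, List<double[]> peaks) {
        StringBuilder sb = new StringBuilder();
        for (String field : FIELDS) {
            String value = field.equals("NumPeaks")
                    ? String.valueOf(peaks.size())
                    : meta.get(field);
            if (value == null) {
                throw new IllegalArgumentException("Missing field: " + field);
            }
            sb.append("# ").append(field).append(" = ").append(value).append('\n');
        }
        for (double[] peak : peaks) {
            // Locale.ROOT keeps the decimal separator a dot on every machine.
            sb.append(String.format(Locale.ROOT, "%.4f %.1f", peak[0], peak[1]))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        Map<String, String> meta = new HashMap<>();
        meta.put("SampleName", "example-spectrum");
        meta.put("InChI", "<InChI of the compound>");
        meta.put("InChIKey", "<InChIKey of the compound>");
        meta.put("IsPositiveIonMode", "true");
        meta.put("PrecursorIonMode", "1");
        meta.put("MassError", "0.001");
        meta.put("MSLevel", "2");
        meta.put("IonizedPrecursorMass", "195.0877");
        meta.put("MolecularFingerPrint", "<MACCS fingerprint from CDK>");

        List<double[]> peaks = List.of(
                new double[] {110.0713, 120.0},
                new double[] {138.0662, 999.0});

        System.out.print(formatRecord(meta, peaks));
    }
}
```

A converter for each source format (NIST, MassBank, ...) would then only need to fill the metadata map and the peak list and append the formatted records to one library file, which is passed to MetFrag via OfflineSpectralDatabaseFile = ...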
The second possibility is to define your own spectral file reader to replace the reader de.ipbhalle.metfraglib.peaklistreader.MultipleTandemMassPeakListReader currently used. Here, you could implement a NIST or a MassBank file reader, which also needs to create a de.ipbhalle.metfraglib.collection.SpectralPeakListCollection object. But you need to include the fingerprint of the underlying molecule for each spectrum.

Thanks @c-ruttkies for the information! Yours, Steffen