sirius-ms / sirius

SIRIUS is a software for discovering a landscape of de-novo identification of metabolites using tandem mass spectrometry. This repository contains the code of the SIRIUS Software (GUI and CLI)
GNU Affero General Public License v3.0

Tanimoto Score Calculation Index out of Bound Error #177

Open Fily-coding opened 3 months ago

Fily-coding commented 3 months ago

I have a problem calculating similarity matrices, specifically the Tanimoto score. I am currently working with SIRIUS 5.8.6 and have a working CLI command for this. A few days ago I encountered an out-of-bounds error in my similarity calculation, and I have not been able to fix it.

I use this command to get the annotations I want, and this works just fine. I get all my files in the SIRIUS directory I choose as the output directory:

    "C:/Program Files/sirius/sirius.exe" -i "//test_directory/data/feature-data.mgf" -o "//test_directory/data/SIRIUS" config --IsotopeSettings.filter=true --FormulaSearchDB= --Timeout.secondsPerTree=0 --FormulaSettings.enforced=HCNOP --Timeout.secondsPerInstance=0 --AdductSettings.detectable=[[M-H2O+H]+,[M+K]+,[M-H]-,[M+Cl]-,[M+Na]+,[M+H3N+H]+,[M+H]+,[M+Br]-,[M-H2O-H]-,[M-H4O2+H]+] --UseHeuristic.mzToUseHeuristicOnly=650 --AlgorithmProfile=orbitrap --IsotopeMs2Settings=IGNORE --MS2MassDeviation.allowedMassDeviation=5.0ppm --NumberOfCandidatesPerIon=1 --UseHeuristic.mzToUseHeuristic=300 --FormulaSettings.detectable=B,Cl,Br,Se,S --NumberOfCandidates=10 --AdductSettings.enforced=, --AdductSettings.fallback=[[M+K]+,[M+Cl]-,[M-H]-,[M+Na]+,[M+H]+,[M+Br]-] --FormulaResultThreshold=true --InjectElGordoCompounds=true --StructureSearchDB=BIO --RecomputeResults=false formula fingerprint structure canopus write-summaries

For the similarity calculation I use this command:

    "C:/Program Files/sirius/sirius.exe" -i "//test_directory/data/SIRIUS" similarity --numpy --tanimoto --tanimoto-canopus -d "//test_directory/data/similarity"

I then encounter the following error, which repeats several times with different job numbers, and I get no output (as expected after these errors):

    Jul 22, 2024 5:47:13 PM de.unijena.bioinf.jjobs.JJob lambda$logError$2
    SEVERE: <27>[JJob-27] Failed!
    java.lang.ArrayIndexOutOfBoundsException: Index 3878 out of bounds for length 3878
        at de.unijena.bioinf.ChemistryBase.fp.ProbabilityFingerprint$PairwiseIterator.getRightProbability(ProbabilityFingerprint.java:295)
        at de.unijena.bioinf.ms.frontend.subtools.similarity.SimilarityMatrixWorkflow.fpcos(SimilarityMatrixWorkflow.java:359)
        at de.unijena.bioinf.ms.frontend.subtools.similarity.SimilarityMatrixWorkflow.lambda$tanimoto$9(SimilarityMatrixWorkflow.java:147)
        at de.unijena.bioinf.ChemistryBase.math.MatrixUtils$1$1.compute(MatrixUtils.java:537)
        at de.unijena.bioinf.jjobs.BasicJJob.call(BasicJJob.java:117)
        at de.unijena.bioinf.jjobs.BasicMasterJJob$1.compute(BasicMasterJJob.java:101)
        at java.base/java.util.concurrent.RecursiveTask.exec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
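To illustrate the kind of failure the trace points at: the exception is thrown while two fingerprints are walked pairwise, and an index equal to the array length means one side was read one position past its end. One situation that could produce this (an assumption, not confirmed for this project space) is two fingerprint vectors of different lengths being compared. The sketch below is not SIRIUS code; the function name `tanimoto`, the plain probability lists, and the 0.5 threshold are all hypothetical choices made only to show a length check that fails with a readable message instead of an index error.

```python
from typing import Sequence


def tanimoto(left: Sequence[float], right: Sequence[float],
             threshold: float = 0.5) -> float:
    """Tanimoto score of two probability vectors after thresholding.

    Hypothetical helper for illustration only; this is NOT how SIRIUS
    computes its similarity matrices.
    """
    # Refuse to compare vectors of different length instead of running
    # off the end of the shorter one.
    if len(left) != len(right):
        raise ValueError(
            f"fingerprint lengths differ: {len(left)} vs {len(right)}"
        )

    both = 0    # positions set in both fingerprints
    either = 0  # positions set in at least one fingerprint
    for l, r in zip(left, right):
        l_set = l >= threshold
        r_set = r >= threshold
        both += l_set and r_set
        either += l_set or r_set

    # By convention here, two all-zero fingerprints score 0.0.
    return both / either if either else 0.0
```

Running such a check over every pair of exported fingerprints would show whether some compounds carry vectors of a different length than the rest.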

I have already tried several things. I regenerated my .mgf file (it contains non-merged MS/MS data from mzMine, which processes my raw data) and recomputed my SIRIUS files. I also tried changing the command in different ways, and I even tried version 6.0.0, which I could not get running with my configuration as I wanted, so I switched back to the previous version.

Do you have any suggestions where the problem may be?

mfleisch commented 2 months ago

Hey, it is unlikely that we will be able to provide further bug fixes for SIRIUS 5, so in general I would recommend switching to SIRIUS 6. Since v6.0.4, a lot of initial bugs and hiccups have been resolved, so it is likely that your initial issues have been resolved as well.

Unfortunately, the similarity tool has not yet been ported to SIRIUS 6. However, we are working on it, and it will be available in one of the upcoming minor releases.