sirius-ms / sirius

SIRIUS is a software for discovering a landscape of de-novo identification of metabolites using tandem mass spectrometry. This repository contains the code of the SIRIUS Software (GUI and CLI)
GNU Affero General Public License v3.0

java.io.IOException: Database have to be a directory of .csv xor .json files #30

Closed: nirshahaf closed this issue 2 years ago

nirshahaf commented 3 years ago

Hi,

When trying to run structure annotation against a local DB (since the online connection to the CSI server is still jammed), I get the above error message. However, the database is the direct output of the stand-alone 'custom-db' tool and contains numerous '.json.gz' files. I'm currently running SIRIUS version 4.6.0.

A side issue: when running 'custom-db' on a local structure DB of around 350K compounds, I get a library folder with 57619 files. In contrast, when running the same 'custom-db' command with the added option of uniting it with some other DBs:

--derive-from="COCONUT,KNAPSACK,CHEBI,KEGG,GNPS,PLANTCYC,SUPERNATURAL"

I get a new folder with exactly one additional file and very similar size.

Any additional info about how to properly run CSI with an internal DB and how to gen. unified custom-DBs would be welcome!

Thanks,

mfleisch commented 3 years ago

Hey,

Trying to structure annotate vs. a local DB (since online connection to the CSI server is still jammed), I get the above error msg.

  1. What do you mean with CSI server connection is still jammed? I cannot see any problems for 4.6.0.
     Please note that using a custom DB only lets you supply custom candidate lists; it does not allow offline search, since the predictions are still done by our servers.

  2. Regarding "Database have to be a directory of .csv xor .json files": this is usually caused by a wrong custom DB name or path. There are two possibilities here:

    • Either you have stored the DB outside the SIRIUS config directory (by giving an absolute path during custom-db execution), then you have to reference the absolute path when using the database as search DB for CSI:FingerID.
    • Or you have just given a custom DB name during custom-db execution, then it is stored inside the SIRIUS config directory. In that case you have to use the DB name when using the database as search DB for CSI:FingerID.
  3. --derive-from will not add additional files to the custom DB directory; it just sets a flag so that the candidates of the derived DBs are included in the search results when using the custom DB. All candidates of the derived DBs will also be flagged with the name of the custom DB in the result list.
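
For anyone who wants to double-check what a custom DB folder actually contains, here is a rough Python sketch. It is a hypothetical helper, not SIRIUS code: the name `inspect_db_dir` and its result keys are made up for illustration, and it only mirrors the wording of the error message (a directory holding .csv xor .json files). Treating gzipped files such as .json.gz as their uncompressed type is an assumption, as is the idea that SIRIUS applies exactly this rule.

```python
from collections import Counter
from pathlib import Path

# Hypothetical diagnostic, NOT the check SIRIUS performs internally.
# Assumption: a trailing ".gz" is ignored, so "x.json.gz" counts as ".json".
COMPRESSION_SUFFIXES = {".gz"}


def data_extension(path: Path) -> str:
    """Return the file's data extension, ignoring a trailing compression suffix."""
    suffixes = [s.lower() for s in path.suffixes]
    if suffixes and suffixes[-1] in COMPRESSION_SUFFIXES:
        suffixes = suffixes[:-1]
    return suffixes[-1] if suffixes else ""


def inspect_db_dir(db_path: str) -> dict:
    """Summarise a candidate custom DB location.

    Reports whether the path is a directory, how many files of each data
    extension it holds, and whether it contains .csv xor .json files.
    """
    path = Path(db_path).expanduser()
    if not path.is_dir():
        return {"path": str(path), "is_dir": False,
                "counts": {}, "csv_xor_json": False}

    counts = Counter(data_extension(p) for p in path.iterdir() if p.is_file())
    has_csv = counts.get(".csv", 0) > 0
    has_json = counts.get(".json", 0) > 0
    return {"path": str(path), "is_dir": True,
            "counts": dict(counts), "csv_xor_json": has_csv != has_json}
```

Calling `inspect_db_dir` on the exact path passed to the search shows whether it resolves to a directory at all and how many files of each type it holds. The same counts make it easy to compare two custom-db output folders, for example one created with --derive-from and one without.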

Any additional info about how to properly run CSI with an internal DB and how to gen. unified custom-DBs would be welcome!

Can you specify what kind of information would help you? I think you already know this piece of documentation? https://boecker-lab.github.io/docs.sirius.github.io/cli-standalone/#custom-database-tool

nirshahaf commented 3 years ago
  1. What do you mean with CSI server connection is still jammed? I cannot see any problems for 4.6.0.

It used to work fine until about two weeks ago; since then the connection just blocks. I sent the details in an earlier post. It might be a local problem on our side. Can you perhaps send me the IP/port numbers so I can check whether some internal security policy is blocking the connection to the CSI server?

Or you have just given a custom DB name during custom-db execution, then it is stored inside the SIRIUS config directory. In that case you have to use the DB name when using the database as search DB for CSI:FingerID.

That is in fact exactly what I did, and I confirmed that the database was created in that directory with the DB name as the folder name. I ran the CLI with both the absolute path and the custom-db name: the former gave the above error, while the latter simply stopped with:

java.lang.IllegalArgumentException: No search DB given!

The online page explains nicely how to prepare the input file - but in my opinion details are missing regarding what actions are being performed (to avoid misunderstanding of the process) and an actual run example (which can be in the Sirius help page).

mfleisch commented 3 years ago

Hey,

I sent the detail in an earlier post.

Do you mean this one? Hmm, none of the log outputs looks like a connection issue. What ILP solver are you using? The integrated one?

can you perhaps send me the IP/port numbers so I can check if some internal security policy is blocking the connection to CSI server.

This is the URL for SIRIUS version 4.6.x: https://www.csi-fingerid.uni-jena.de:8443/v1.4.8/ You should be redirected to the swagger-ui which gives you the possibility to create REST queries for testing.
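
To see whether a local firewall or security policy blocks that host and port, a plain TCP connection test can help. The following Python sketch uses only the standard library. `host_and_port` and `can_connect` are illustrative names, not part of SIRIUS, and a successful TCP connection only shows that the port is reachable, not that the web service behind it answers correctly.

```python
import socket
from urllib.parse import urlsplit

# Fallback ports used when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_and_port(url: str) -> tuple[str, int]:
    """Extract the host name and TCP port from a URL."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no host in URL: {url!r}")
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    if port is None:
        raise ValueError(f"no port given and unknown scheme in URL: {url!r}")
    return parts.hostname, port


def can_connect(url: str, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example (performs a real network connection when uncommented):
# print(can_connect("https://www.csi-fingerid.uni-jena.de:8443/v1.4.8/"))
```

If `can_connect` returns False from the affected machine but True from a network outside the institution, a local policy blocking port 8443 is a likely cause.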

That in fact is exactly what I did and I confirmed that the database was created in that directory and with DB name as the folder name.

Can you please post the exact command you used to create the custom database and the command you used for the CSI:FingerID search in the created DB? I will see if I can reproduce the issue.

The online page explains nicely how to prepare the input file - but in my opinion details are missing regarding what actions are being performed (to avoid misunderstanding of the process) and an actual run example (which can be in the Sirius help page).

OK, I will add a few lines to the doc regarding your suggestions.