robotoD / GenoVi

GenoVi, an automated customizable circular genome visualizer for bacteria and archaea

Error in circos.svg #21

Open arghya1611 opened 3 weeks ago

arghya1611 commented 3 weeks ago

Hi, thanks for developing GenoVi. It looks like a great tool, but I have not been able to run it. Please see the error message below.

(/data/Food/analysis/R1479_neurofoods/genovi) [arghya.mukherjee@compute07 genovi-test]$ genovi -i ../GenoVi/input_test/Corynebacterium_alimapuense_VA37.gbk -o test --status complete --colour_scheme autumn --verbose yes
/data/Food/analysis/R1479_neurofoods/genovi/lib/python3.7/site-packages/Bio/GenBank/Scanner.py:1219: BiopythonParserWarning: Premature end of file in sequence data
  "Premature end of file in sequence data", BiopythonParserWarning

GBK file transformed into faa succesfully. File saved as test-temp/contig_1-test.faa
output test-temp/contig_1-test

Deepnog prediction started

deepnog infer test-temp/contig_1-test.faa --out test-temp/contig_1-test_prediction_deepnog.csv -db cog2020 -t 1
[2024-10-31 23:35:31] deepnog.client.client - INFO - Starting deepnog
[2024-10-31 23:35:32] deepnog.client.client - INFO - Loading NN-parameters from /home/arghya.mukherjee/deepnog_data/cog2020/1/deepnog.pth ...
[2024-10-31 23:35:32] deepnog.client.client - INFO - Accessing dataset from test-temp/contig_1-test.faa ...
[2024-10-31 23:35:32] deepnog.client.client - INFO - Starting protein sequence group/family inference ...
[2024-10-31 23:35:32] deepnog.learning.inference - INFO - Inference device: cpu
deepnog inference: 273seq [00:05, 47.4seq/s]
[2024-10-31 23:35:38] deepnog.learning.inference - INFO - Inference complete.
[2024-10-31 23:35:38] deepnog.client.client - INFO - Writing prediction to test-temp/contig_1-test_prediction_deepnog.csv
[2024-10-31 23:35:38] deepnog.client.client - INFO - All done.

Deepnog prediction finished succesfully. Predictions saved as test-temp/contig_1-test_prediction_deepnog.csv

/data/Food/analysis/R1479_neurofoods/genovi/lib/python3.7/site-packages/Bio/GenBank/Scanner.py:1219: BiopythonParserWarning: Premature end of file in sequence data
  "Premature end of file in sequence data", BiopythonParserWarning
test-temp/contig_1-test_bands.kar created succesfully.
Traceback (most recent call last):
  File "/data/Food/analysis/R1479_neurofoods/genovi/bin/genovi", line 8, in <module>
    sys.exit(main())
  File "/data/Food/analysis/R1479_neurofoods/genovi/lib/python3.7/site-packages/scripts/GenoVi.py", line 685, in main
    visualiseGenome(*get_args())
  File "/data/Food/analysis/R1479_neurofoods/genovi/lib/python3.7/site-packages/scripts/GenoVi.py", line 369, in visualiseGenome
    change_background("none", False)
  File "/data/Food/analysis/R1479_neurofoods/genovi/lib/python3.7/site-packages/scripts/GenoVi.py", line 47, in change_background
    file = open(fileName)
FileNotFoundError: [Errno 2] No such file or directory: 'circos.svg'

The error is the same whether I use your test GBK files or my own. I would really like to use the tool, so any help would be appreciated. Regards

arodel21 commented 3 weeks ago

Hi @arghya1611! Thanks for using our tool and reaching out with your question.

The warning BiopythonParserWarning: Premature end of file in sequence data indicates that the file is expected to be longer. Could you please check that the end of the file is as expected? Sometimes Biopython has issues with old annotation tools.
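One way to check the end of the file is sketched below. It assumes the warning is raised because the parser reached the end of the file without seeing the `//` line that closes a GenBank record; the function name `gbk_ends_with_terminator` is made up for this example and is not part of GenoVi or Biopython.

```python
def gbk_ends_with_terminator(path):
    """Return True if the last non-blank line of the file is the '//' terminator."""
    last = ""
    # errors="replace" keeps the check from failing on stray non-UTF-8 bytes.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            if stripped:
                last = stripped
    return last == "//"
```

Calling it on the GBK file passed to `-i` returns `False` when the final record is not closed, which would be worth fixing before looking further.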

Also, what do the predictions and test-temp/contig_1-test_bands.kar files look like? Are they empty?

The No such file or directory: 'circos.svg' error could mean either that GenoVi is not processing your files because they are malformed, or that the Circos installation is not being found. Could you please check your Circos installation by running which circos?
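The same check can be made from Python, which is handy when you want to confirm what the interpreter inside the conda environment sees. This is only a sketch: `locate_tool` and `describe_tool` are hypothetical helper names, and the only library call used is the standard `shutil.which`.

```python
import shutil


def locate_tool(name="circos", search_path=None):
    """Return the full path of an executable, or None if it is not found.

    search_path=None means "use the PATH of the current process".
    """
    return shutil.which(name, path=search_path)


def describe_tool(name="circos", search_path=None):
    """Build a one-line report suitable for pasting into an issue."""
    found = locate_tool(name, search_path)
    if found is None:
        return f"{name}: not found on PATH"
    return f"{name}: {found}"
```

Running `print(describe_tool())` with the environment activated should show a path inside that environment if Circos is visible to Python.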

Let us know how this goes!

arghya1611 commented 3 weeks ago

Hey @arodel21

Thanks for your prompt response, much appreciated! I have answered your queries below. The results are from a run with the GBK files in the GenoVi/input_test/Brevibacterium_Genomes/ folder on GitHub. FYI, GenoVi is installed in a conda environment.

The original command was: genovi -i GenoVi/input_test/Brevibacterium_Genomes/GCA_000426445.gbk -o test --status complete --colour_scheme autumn --verbose yes


> The warning BiopythonParserWarning: Premature end of file in sequence data indicates that the file is expected to be longer. Could you please check that the end of the file is as expected? Sometimes Biopython has issues with old annotation tools.

As mentioned above, the GBK file used is one provided with the GitHub repo. However, I ran into the same issue with GBK files generated with the latest version of Prokka. Can you tell me if there is anything in particular I should look for in the file? It looks fine to me. Also, note in the run below that the warning disappears when I use --status draft. Does GenoVi expect --status draft when a GBK file contains multiple contigs? I have a file with multiple contigs, but all of them are complete: one chromosome and several plasmids.
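For reference, a quick way to see how many records a GBK file holds is to count its LOCUS lines. The sketch below does only that; it says nothing about how GenoVi itself treats multi-record files, and `list_gbk_records` is a name invented for this example.

```python
def list_gbk_records(path):
    """Return the record names found on the LOCUS lines of a GenBank file."""
    names = []
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.startswith("LOCUS"):
                parts = line.split()
                # The record name is normally the second whitespace-separated field.
                names.append(parts[1] if len(parts) > 1 else "")
    return names
```

The length of the returned list is the number of contigs (chromosome plus plasmids) in the file.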

> Also, what do the predictions and test-temp/contig_1-test_bands.kar files look like? Are they empty?

No, it's not empty. See the contents below. I checked the other temp files as well; all of them contain data except the rRNA files.

(/data/Food/analysis/R1479_neurofoods/genovi) [arghya.mukherjee@compute07 genovi-results]$ more genovi-test/test-temp/contig_1-test_bands.kar 
chr - chr1 1 0 1026773 black
band chr1 band01 band01 0 1026773 white
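To inspect several karyotype files at once, the `chr` lines can be read with a few lines of Python. This sketch assumes the usual Circos karyotype layout (`chr - ID LABEL START END COLOUR`), which matches the file shown above; `read_karyotype_chromosomes` is a hypothetical helper name.

```python
def read_karyotype_chromosomes(path):
    """Return (id, label, start, end, colour) for every 'chr' line in the file."""
    chromosomes = []
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            fields = line.split()
            # 'band' lines and blank lines are skipped; only 'chr' lines are kept.
            if len(fields) >= 7 and fields[0] == "chr":
                _, _, chrom_id, label, start, end, colour = fields[:7]
                chromosomes.append((chrom_id, label, int(start), int(end), colour))
    return chromosomes
```

For the file above this yields a single chromosome, `chr1`, spanning 0 to 1026773.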

> The No such file or directory: 'circos.svg' error could mean either that GenoVi is not processing your files because they are malformed, or that the Circos installation is not being found. Could you please check your Circos installation by running which circos?

(/data/Food/analysis/R1479_neurofoods/genovi) [arghya.mukherjee@compute07 test-temp]$ /data/Food/analysis/R1479_neurofoods/genovi/bin/circos

I have also checked the Circos version in the conda env. It looks okay as per the listed requirements.

I reran the same GBK file with the status changed to draft. It ran better, but the final error remains.

(/data/Food/analysis/R1479_neurofoods/genovi) [arghya.mukherjee@compute07 genovi-results]$ genovi -i GenoVi/input_test/Brevibacterium_Genomes/GCA_000426445.gbk -o test-results-1 --status draft --colour_scheme autumn --verbose yes

GBK file transformed into faa succesfully. File saved as test-results-1-temp/test-results-1.faa
output test-results-1-temp/test-results-1

Deepnog prediction started

deepnog infer test-results-1-temp/test-results-1.faa --out test-results-1-temp/test-results-1_prediction_deepnog.csv -db cog2020 -t 1
[2024-11-02 21:37:23] deepnog.client.client - INFO - Starting deepnog
[2024-11-02 21:37:32] deepnog.client.client - INFO - Loading NN-parameters from /home/arghya.mukherjee/deepnog_data/cog2020/1/deepnog.pth ...
/data/Food/analysis/R1479_neurofoods/genovi/lib/python3.8/site-packages/deepnog/client/client.py:324: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  model_dict = torch.load(weights_path, map_location=args.device)
[2024-11-02 21:37:32] deepnog.client.client - INFO - Accessing dataset from test-results-1-temp/test-results-1.faa ...
[2024-11-02 21:37:32] deepnog.client.client - INFO - Starting protein sequence group/family inference ...
[2024-11-02 21:37:32] deepnog.learning.inference - INFO - Inference device: cpu
deepnog inference: 3.54kseq [00:53, 65.5seq/s]
[2024-11-02 21:38:26] deepnog.learning.inference - INFO - Inference complete.
[2024-11-02 21:38:26] deepnog.client.client - INFO - Writing prediction to test-results-1-temp/test-results-1_prediction_deepnog.csv
[2024-11-02 21:38:27] deepnog.client.client - INFO - All done.

Deepnog prediction finished succesfully. Predictions saved as test-results-1-temp/test-results-1_prediction_deepnog.csv

test-results-1-temp/test-results-1_bands.kar created succesfully.
Traceback (most recent call last):
  File "/data/Food/analysis/R1479_neurofoods/genovi/bin/genovi", line 8, in <module>
    sys.exit(main())
  File "/data/Food/analysis/R1479_neurofoods/genovi/lib/python3.8/site-packages/scripts/GenoVi.py", line 685, in main
    visualiseGenome(*get_args())
  File "/data/Food/analysis/R1479_neurofoods/genovi/lib/python3.8/site-packages/scripts/GenoVi.py", line 579, in visualiseGenome
    change_background("none")
  File "/data/Food/analysis/R1479_neurofoods/genovi/lib/python3.8/site-packages/scripts/GenoVi.py", line 47, in change_background
    file = open(fileName)
FileNotFoundError: [Errno 2] No such file or directory: 'circos.svg'
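The traceback only shows that `change_background` tried to open `circos.svg` and the file was not there, which suggests the step expected to produce it failed without a visible message. As an illustration of how that kind of failure can be surfaced, here is a small sketch that runs an external command and checks for its output file before anything tries to read it. It is not GenoVi's code; `run_and_require` is a hypothetical helper.

```python
import os
import subprocess


def run_and_require(command, expected_output):
    """Run a command and fail loudly if it errors or its output file is missing."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0 or not os.path.exists(expected_output):
        raise RuntimeError(
            f"{command[0]} did not produce {expected_output} "
            f"(exit code {result.returncode}).\n"
            f"stdout:\n{result.stdout}\nstderr:\n{result.stderr}"
        )
    return expected_output
```

Called as `run_and_require(["circos", "-conf", "circos.conf"], "circos.svg")`, where `circos.conf` is a placeholder for the configuration file left in the temp folder, it would print whatever Circos itself reported instead of a bare FileNotFoundError.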

This message is already too long, and I really appreciate your response. Still, I feel I should mention that GenoVi would really benefit from a more extensive tutorial explaining the nuances of the software. I like GenoVi because it's a good alternative to Circos, which is a tad clunky to configure and use, so I would be happy to see GenoVi work out for everyone!