metageni / FOCUS

FOCUS: An Agile Profiler for Metagenomic Data
GNU General Public License v3.0

[Exception] "has no k-mers count. Probably not valid file." #21

Closed vinisalazar closed 4 years ago

vinisalazar commented 4 years ago

Hi,

I'm trying to run FOCUS on assembled metagenome reads.

I am getting the following exception:

[2020-02-13 06:16:20,955 - INFO] FOCUS: An Agile Profiler for Metagenomic Data
[2020-02-13 06:16:20,956 - INFO] OUTPUT: test_output_pravaler does not exist - just created it :)
[2020-02-13 06:16:20,956 - INFO] 1) Loading Reference DB
[2020-02-13 06:16:22,441 - INFO] 2) Reference DB was loaded with 2785 reference genomes
[2020-02-13 06:16:22,442 - INFO] 3.1) Working on: 4DHI2c-formatted.fna
[2020-02-13 06:16:22,442 - INFO]    Counting k-mers
Failed to open input file 'kmer_counting_0.5626527946381891021'
Failed to open input file 'kmer_counting_0.562652794638189'
rm: cannot remove 'kmer_counting_0.562652794638189': No such file or directory
Traceback (most recent call last):
  File "/home/vini/anaconda3/envs/focus/bin/focus", line 12, in <module>
    sys.exit(main())
  File "/home/vini/anaconda3/envs/focus/lib/python3.8/site-packages/focus_app/focus.py", line 310, in main
    query_count = normalise(count_kmers(Path(query, temp_query), kmer_size, threads, kmer_order))
  File "/home/vini/anaconda3/envs/focus/lib/python3.8/site-packages/focus_app/focus.py", line 148, in count_kmers
    raise Exception('{} has no k-mers count. Probably not valid file'.format(query_file))
Exception: test/4DHI2c-formatted.fna has no k-mers count. Probably not valid file

However, my FASTA file is perfectly valid. On my previous attempt, in which I got the same error, I noticed lower case sequences in my FASTA file. I thought converting them all to upper case would help, but it still gave me the same error. I do have ambiguous nucleotides (N characters) in my reads; could that be the problem? I tried the FOCUS command with a small file (3 sequences, < 200 bp each), and it worked fine.

Thank you for any assistance you can provide,

V
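
The two hypotheses raised above (lower case bases and N characters) can be quantified before re-running FOCUS. The following is a minimal sketch in plain Python; the function name `summarise_fasta` and the category names are hypothetical, and the script says nothing about how FOCUS or Jellyfish treat these characters. It only reports what the file contains.

```python
from collections import Counter


def summarise_fasta(path):
    """Count records and base categories in a FASTA file (hypothetical helper)."""
    records = 0
    counts = Counter()
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                records += 1  # header line starts a new record
                continue
            for base in line:
                if base in "ACGT":
                    counts["upper_acgt"] += 1
                elif base in "acgt":
                    counts["lower_acgt"] += 1
                elif base in "Nn":
                    counts["n"] += 1
                else:
                    counts["other"] += 1  # e.g. IUPAC codes such as R or Y
    summary = {"records": records}
    for key in ("upper_acgt", "lower_acgt", "n", "other"):
        summary[key] = counts[key]
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python summarise_fasta.py test/4DHI2c-formatted.fna
    if len(sys.argv) > 1:
        print(summarise_fasta(sys.argv[1]))
```

A file that is entirely upper case A/C/G/T with a non-zero record count makes an input problem less likely and points towards the tooling instead.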

vinisalazar commented 4 years ago

I thought this was caused by a lack of space on my machine to create the temporary files, but I checked and that was not the case.

metageni commented 4 years ago

Hi @vinisalazar. Thanks for reporting this.

Which version of Jellyfish are you using? After a certain version, Jellyfish started raising errors with FOCUS. I need some time to understand what is going on and fix it - it is linked to how I run the tool in FOCUS.

Let me know

Best

vinisalazar commented 4 years ago

Jellyfish version is 2.3.0 - it was installed automatically by conda when installing FOCUS (I created a fresh env with conda create -n focus focus -c bioconda).

metageni commented 4 years ago

Gotcha. Please give me some time so I can start investigating what is up. Thanks

vinisalazar commented 4 years ago

thank you very much. I'll stay tuned.

metageni commented 4 years ago

@vinisalazar I think it is Jellyfish. I need to fix it in the FOCUS conda recipe.

In the meantime, could you please install Jellyfish 2.2.6 in your conda environment and try running it again?

I will probably fix FOCUS to work with the latest Jellyfish and release a new version. Keep me posted.
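
To see which Jellyfish an environment actually resolves to, something like the following could be run inside the activated env. This is a rough sketch: the helper names are hypothetical, and it assumes that `jellyfish --version` prints a line containing a dotted version number (for example `jellyfish 2.3.0`). The 2.2.6 threshold is the version mentioned in this thread.

```python
import re
import shutil
import subprocess

KNOWN_GOOD = (2, 2, 6)  # version suggested in this thread


def parse_version(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def jellyfish_version():
    """Return the version of the jellyfish binary on PATH, or None if absent."""
    if shutil.which("jellyfish") is None:
        return None
    # Assumption: the version string goes to stdout (stderr used as fallback).
    result = subprocess.run(
        ["jellyfish", "--version"], capture_output=True, text=True
    )
    return parse_version(result.stdout or result.stderr)


if __name__ == "__main__":
    version = jellyfish_version()
    if version is None:
        print("jellyfish not found on PATH")
    elif version > KNOWN_GOOD:
        print("jellyfish", version, "is newer than", KNOWN_GOOD)
    else:
        print("jellyfish", version, "is at or below", KNOWN_GOOD)
```

Comparing tuples of integers avoids the string-comparison pitfall where "2.10.0" would sort before "2.3.0".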

vinisalazar commented 4 years ago

thx I'll try that and report back

metageni commented 4 years ago

Just checked on bioconda and someone changed the recipe to use jellyfish >= 2.2.6 rather than == 2.2.6

I will change things on my side to use >= 2.2.6.

https://github.com/bioconda/bioconda-recipes/commit/250adf678d69ba08afff8250082fdf44c9a38003
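
Why the recipe change matters can be shown with a toy model of the two pins. The sketch below is a deliberate simplification and not conda's real matching logic; `satisfies` and `as_tuple` are hypothetical helpers that only understand the `>=` and `==` operators.

```python
import operator

# Only the two operators discussed in this thread are modelled.
OPERATORS = {">=": operator.ge, "==": operator.eq}


def as_tuple(version):
    """Turn '2.2.6' into (2, 2, 6) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version, spec):
    """Return True if version meets a spec such as '>=2.2.6' or '==2.2.6'."""
    for symbol, compare in OPERATORS.items():
        if spec.startswith(symbol):
            required = spec[len(symbol):]
            return compare(as_tuple(version), as_tuple(required))
    raise ValueError(f"unsupported spec: {spec!r}")


if __name__ == "__main__":
    for spec in ("==2.2.6", ">=2.2.6"):
        print(spec, "accepts 2.3.0:", satisfies("2.3.0", spec))
```

Under the `>=` pin a fresh environment is free to pick up 2.3.0, which matches the version reported earlier in the thread.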

metageni commented 4 years ago

@vinisalazar: Any feedback? thanks

vinisalazar commented 4 years ago

@metageni I apologise but I had to put this on hold for a while. I haven't abandoned it though. Will get back when I have any news.

metageni commented 4 years ago

Hi @vinisalazar, I just ran it with no problems:

conda create -n focus focus -c bioconda
conda activate focus
focus -q in/ -o output/

[2020-04-02 22:41:16,606 - INFO] FOCUS: An Agile Profiler for Metagenomic Data
[2020-04-02 22:41:16,606 - INFO] OUTPUT: output does not exist - just created it :)
[2020-04-02 22:41:16,606 - CRITICAL] DATABASE: /Users/geni.silva/miniconda3/envs/focus/lib/python3.8/site-packages/focus_app/db/k6 does not exist. Did you extract db.zip?
[2020-04-02 22:41:16,606 - INFO] DATABASE: Uncompressing Database for you :)
Archive: /Users/geni.silva/miniconda3/envs/focus/lib/python3.8/site-packages/focus_app/db.zip
creating: /Users/geni.silva/miniconda3/envs/focus/lib/python3.8/site-packages/focus_app/db/
inflating: /Users/geni.silva/miniconda3/envs/focus/lib/python3.8/site-packages/focus_app/db/k6
inflating: /Users/geni.silva/miniconda3/envs/focus/lib/python3.8/site-packages/focus_app/db/k7
[2020-04-02 22:41:17,405 - INFO] 1) Loading Reference DB
[2020-04-02 22:41:18,622 - INFO] 2) Reference DB was loaded with 2785 reference genomes
[2020-04-02 22:41:18,623 - INFO] 3.1) Working on: hiseq_reads_ecoli_R2.fastq
[2020-04-02 22:41:18,623 - INFO] Counting k-mers
[2020-04-02 22:41:25,099 - INFO] Running FOCUS
[2020-04-02 22:41:26,016 - INFO] 5) Writing Results to output
[2020-04-02 22:41:26,018 - INFO] 5.1) Working on Kingdom
[2020-04-02 22:41:26,021 - INFO] 5.2) Working on Phylum
[2020-04-02 22:41:26,024 - INFO] 5.3) Working on Class
[2020-04-02 22:41:26,027 - INFO] 5.4) Working on Order
[2020-04-02 22:41:26,030 - INFO] 5.5) Working on Family
[2020-04-02 22:41:26,035 - INFO] 5.6) Working on Genus
[2020-04-02 22:41:26,043 - INFO] 5.7) Working on Species
[2020-04-02 22:41:26,056 - INFO] 5.8) Working on Strain
[2020-04-02 22:41:26,080 - INFO] Done

vinisalazar commented 4 years ago

Hi @metageni thank you for remembering.

I did a fresh install but I'm afraid the error still stands. I'm working to see if the problems are in my input files (although that would be unlikely).

I'll report back if I have any news.

Best wishes, V

metageni commented 4 years ago

@vinisalazar This should fix it. It is just a matter of bioconda pushing it to the code repository.

metageni commented 4 years ago

I have released it outside of bioconda (https://github.com/metageni/FOCUS/releases/tag/1.5)

metageni commented 4 years ago

@vinisalazar merged on bioconda. Please update the FOCUS version and try again.

https://github.com/bioconda/bioconda-recipes/pull/21293

vinisalazar commented 4 years ago

It works! Thank you :smile:

linsalrob commented 4 years ago

Just to confirm, I had the same issue, but it was easy to resolve:

Reinstall a new version of focus from conda:

conda remove --name focus --all
conda create -n focus -c bioconda focus

Worked without issue.