NBISweden / aMeta

Ancient microbiome snakemake workflow
MIT License

issue with krakenuniq output when running on KrakenUniq full NCBI NT database #136

Open CjeicaM opened 9 months ago

CjeicaM commented 9 months ago

The krakenuniq.output file contains a carriage return (CR) character inserted in each line after the rank column, so the pipeline fails after that step. This does not happen with the microbial version of the database. The workflow was run on our HPC cluster (Rocky Linux environment with Slurm), and two different installations of krakenuniq (1.0.0 and 1.0.4) were tested.
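For anyone who wants to confirm the same symptom, here is a minimal Python sketch that counts the lines containing a CR byte. It is not part of aMeta or KrakenUniq; the function name `count_cr_lines` is hypothetical, and the path passed to it is whatever krakenuniq.output file you want to inspect.

```python
def count_cr_lines(path):
    """Return (lines_with_cr, total_lines) for the file at `path`."""
    lines_with_cr = 0
    total_lines = 0
    # Binary mode keeps Python from translating line endings,
    # so a stray carriage return stays visible as b"\r".
    with open(path, "rb") as handle:
        for line in handle:  # binary iteration splits on b"\n" only
            total_lines += 1
            if b"\r" in line:
                lines_with_cr += 1
    return lines_with_cr, total_lines
```

A non-zero first value means the file contains carriage returns, which many viewers display as ^M.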

clami66 commented 9 months ago

Hi @CjeicaM, that's an interesting issue, thanks for getting in touch about it.

From the logs that @LeandroRitter showed me:

Error in rule KrakenUniq_AbundanceMatrix:
    jobid: 15
    input: results/KRAKENUNIQ/RS105.RISE505.MA873_L1_S2.trim/krakenuniq.output.filtered

Looks like the issue is directly caused by the results/KRAKENUNIQ/RS105.RISE505.MA873_L1_S2.trim/krakenuniq.output.filtered file. Could you attach this file from a run that failed?

Thanks

CjeicaM commented 8 months ago

Hi, sorry for the delay. I am attaching both the results/KRAKENUNIQ/RS105.RISE505.MA873_L1_S2.trim/krakenuniq.output.filtered file and the results/KRAKENUNIQ/RS105.RISE505.MA873_L1_S2.trim/krakenuniq.output file. I think the filtered file is erroneous because of the carriage return characters in the krakenuniq.output file. Attachments: krakenuniq.output.txt, krakenuniq.output.filtered.txt

NikolayOskolkov commented 8 months ago

@CjeicaM did you by any chance use MobaXterm when you discovered the CR inserted into the KrakenUniq report file (or when running aMeta on the HPC)?

clami66 commented 7 months ago

@CjeicaM could you do as described in issue #137, specifically by replacing the taxDB file in your full NT database as described here?

CjeicaM commented 7 months ago

Hi,

I'll try doing it in the next couple of days, and I'll let you know about the results.

Maciej

On Thu, 9 Nov 2023 at 09:09, Claudio Mirabello @.***> wrote:

@CjeicaM (https://github.com/CjeicaM) could you do as described in issue #137 (https://github.com/NBISweden/aMeta/issues/137), specifically by replacing the taxDB file in your full NT database as described here (https://github.com/NBISweden/aMeta/issues/137#issuecomment-1786805440)?

CjeicaM commented 7 months ago

Hi, this seems to have resolved the issue. The pipeline is still running, but it's already past the krakenuniq step, it did not report any error, and both krakenuniq.output and krakenuniq.output.filtered look fine (run through MobaXterm on a Windows machine, using screen and srun on a Linux-based HPC).

LeandroRitter commented 1 week ago

@clami66 the issue is back: one user who recently downloaded the database reported the ^M symbol in krakenuniq.output. Did you upload the corrected taxDB to the SciLifeLab Figshare?
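In the meantime, a downloaded file can be cleaned locally. Below is a minimal Python sketch that writes a copy of a file with every CR byte removed. It rests on assumptions: that the CR originates in the taxDB file, that taxDB is plain text, and that stray CR bytes are the only problem. It is not the fix from issue #137, which replaces the taxDB file, and the function name `strip_cr` is hypothetical.

```python
def strip_cr(src, dst):
    """Copy `src` to `dst` without carriage returns; return how many were removed."""
    removed = 0
    # Binary mode on both sides so no newline translation takes place.
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for line in fin:
            removed += line.count(b"\r")
            fout.write(line.replace(b"\r", b""))
    return removed
```

A return value of 0 means the source file had no carriage returns and the copy is byte-identical. Writing to a separate destination keeps the original download untouched for comparison.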