awiedemer / PRACS24_class_project

run_bracken not working - unknown issue #2

Open awiedemer opened 2 months ago

awiedemer commented 2 months ago

@jelmerp

Hi Jelmer,

I've written a script to run Bracken, which (1) sets up Bracken with bracken-build and (2) runs bracken on the kraken2 outputs.

I've followed the Bracken guidelines (https://github.com/jenniferlu717/Bracken) to the best of my ability when writing my scripts. However, when I run my run_bracken script (after running all of the previous scripts in my runner file), I get the following output, ending with an error that I can't quite understand.

# Starting script run_bracken.sh
Thu Apr 25 16:43:34 EDT 2024
# kraken2 database dir:          ./kraken2/database/
# k-mers:                        35
# read length:                   75

# kraken2 location:              ./conda/kraken2/

# kraken2 results dir:           ./kraken2/kraken_outputs/
# kraken2 reports dir:           ./kraken2/kraken_reports/
# taxa level:                    S
# number of reads before
# abundance estimation
# to perform re-estimation:      10

# output results dir:            ./bracken/bracken_outputs
# output report dir:             ./bracken/bracken_reports
./conda/bracken already exists.
 >> Selected Options:
       kmer length = 35
       read length = 75
       database    = ./kraken2/database/
       threads     = 20
       kraken type = kraken2
 >> Checking for Valid Options...
 >> Creating database.kraken [if not found]
      >> kraken2 --db ./kraken2/database --threads 20 <( find -L ./kraken2/database/library \( -name *.fna -o -name *.fa -o -name *.fasta \) -exec cat {} + ) > ./kraken2/database/database.kraken

Loading database information... done.
/fs/ess/PAS2700/users/awiedemer673/class_project/conda/bracken/bin/bracken-build: line 187: 86707 Killed ${KINSTALL}kraken2 --db $DATABASE --threads ${THREADS} <( find -L $DATABASE/library \( -name "*.fna" -o -name "*.fa" -o -name "*.fasta" \) -exec cat {} + ) > $DATABASE/database.kraken
find: ‘cat’ terminated by signal 13
slurmstepd: error: Detected 1 oom_kill event in StepId=28235758.batch. Some of the step tasks have been OOM Killed.

The script used to run this code should be available in my scripts folder.

Could you put me in the right direction for how to fix this? Thank you!!

jelmerp commented 2 months ago

Hi Aaron,

This type of error...

slurmstepd: error: Detected 1 oom_kill event in StepId=28235758.batch. Some of the step tasks have been OOM Killed.

...means that the process (bracken-build) tried to use more memory than it had available (OOM = out-of-memory). You asked for 20 cores, which gave you 80 GB, but this process can need a lot of memory. In a script that I have for bracken-build, I ask for 125 GB. It may need more or less than that depending on the size of the Kraken DB, which I guess it loads into memory. So perhaps you can try with however much you used for your Kraken job, assuming that's more than 80 GB. At any rate, you'll need to increase the amount.
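As a rough way to pick a memory request, the sketch below sums the sizes of the Kraken2 index files in the database directory and checks them against a planned request in GB. It rests on two assumptions that are not confirmed in this thread: that the index files are the `*.k2d` files in the database directory, and that their combined size is a reasonable lower bound for what the job needs (following the guess above that the DB is loaded into memory). The function names `db_size_bytes` and `mem_covers_db` are made up for illustration.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the total size in bytes of the *.k2d files
# in a Kraken2 database dir (assumption: these are the index files).
db_size_bytes() {
    local db_dir="$1"
    local total=0
    local f size
    for f in "$db_dir"/*.k2d; do
        [ -f "$f" ] || continue          # skip if the glob matched nothing
        size=$(wc -c < "$f")
        total=$(( total + size ))
    done
    echo "$total"
}

# Hypothetical helper: succeed (exit status 0) if the planned memory
# request, given in GB, is at least as large as the database files.
mem_covers_db() {
    local db_dir="$1"
    local mem_gb="$2"
    local need
    need=$(db_size_bytes "$db_dir")
    [ $(( mem_gb * 1024 * 1024 * 1024 )) -ge "$need" ]
}

# Possible use before submitting (database path taken from the log above):
#   mem_covers_db ./kraken2/database 80 || echo "80 GB is below the DB size"
```

The request itself is then made with sbatch's `--mem` option, for example `--mem=125G`, either on the command line or in an `#SBATCH` line at the top of the job script. Passing this check does not guarantee the job fits in memory; it only flags requests that are clearly too small.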

Let me know if you can get this to work!


On another note, I think you are doing too many things in this run_bracken.sh script. I would have separate scripts for bracken-build and bracken proper, and then only run bracken for 1 single Kraken report in the latter script: i.e., with no loop inside the batch job script, but only in the runner script.
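The suggested split could look roughly like the sketch below, where the loop lives in the runner and each batch job receives exactly one Kraken report. The function name `submit_bracken_jobs` and the job script path `scripts/bracken.sh` are hypothetical, and the function only prints the `sbatch` commands (a dry run) so it can be checked before anything is submitted.

```bash
#!/usr/bin/env bash

# Hypothetical runner helper: print one sbatch command per Kraken report.
# The loop is here, in the runner, not inside the batch job script.
submit_bracken_jobs() {
    local reports_dir="$1"    # e.g. ./kraken2/kraken_reports
    local job_script="$2"     # e.g. scripts/bracken.sh (hypothetical name)
    local report
    for report in "$reports_dir"/*; do
        [ -f "$report" ] || continue     # skip if the dir is empty
        # Dry run: remove the leading 'echo' to actually submit the job
        echo sbatch "$job_script" "$report"
    done
}

# Possible use in the runner (paths taken from the log above):
#   submit_bracken_jobs ./kraken2/kraken_reports scripts/bracken.sh
```

The job script would then take the report as its first argument and run bracken once on it. bracken-build would go in its own script, submitted once with a larger memory request.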

awiedemer commented 2 months ago

Thanks! So far so good now
