DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

kraken2-build: add checkpoints or remove older database first #270

Open Dries-B opened 4 years ago

Dries-B commented 4 years ago

I am currently trying to build the Kraken2 standard database on a computing cluster, but one job has timed out and another has failed.

The log is attached below. Although it is not entirely clear to me (see issue #269), it looks as if each new command repeats steps that earlier attempts had already completed. Would it be possible to reuse the progress from earlier attempts? Please consider adding checkpoints to kraken2-build.

Alternatively, for clarity, I would recommend that kraken2-build remove an older database first and record this in the log. Perhaps a new `--overwrite` argument could control this?
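For illustration, a checkpoint workaround can be sketched outside kraken2-build itself. This is a minimal sketch, not a Kraken2 feature: the `run_step` helper, the `build_standard_db` function, and the `.checkpoints` marker directory are hypothetical names. It assumes the separate `--download-taxonomy`, `--download-library`, and `--build` modes described in the Kraken2 manual, and the library list is an assumption about what `--standard` downloads, which may differ between versions.

```shell
DB="${DB:-kraken2_db}"            # database directory, as in the commands in this issue
CKPT="${CKPT:-$DB/.checkpoints}"  # hypothetical marker directory, not part of Kraken2

# run_step NAME CMD...: run CMD unless an earlier run already left a marker for NAME.
run_step() {
  local name="$1"; shift
  mkdir -p "$CKPT"
  if [ -e "$CKPT/$name.done" ]; then
    echo "skip: $name"
    return 0
  fi
  echo "run: $name"
  # Only write the marker if the command succeeded.
  "$@" || { echo "failed: $name" >&2; return 1; }
  touch "$CKPT/$name.done"
}

# Staged stand-in for `kraken2-build --standard`.
# ASSUMPTION: this library list mirrors the standard build; check your version.
build_standard_db() {
  run_step taxonomy kraken2-build --download-taxonomy --db "$DB" || return 1
  for lib in archaea bacteria plasmid viral human UniVec_Core; do
    run_step "library_$lib" kraken2-build --download-library "$lib" --db "$DB" || return 1
  done
  run_step build kraken2-build --build --threads "${THREADS:-24}" --db "$DB"
}
```

Calling `build_standard_db` in each resubmitted job would then skip any step whose marker file exists. The markers only record that a command exited successfully; they do not verify the files it produced, so deleting a marker is the way to force a step to run again.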

`sbatch --time=3:00:00 --nodes=1 --partition=normal --wrap="kraken2-build --standard --threads 24 --db kraken2_db &>> kraken2-build.log"`
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library... done.
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library...slurmstepd: error: *** JOB CANCELLED AT 2020-06-24T16:43:49 DUE TO TIME LIMIT ***

`sbatch --time=12:00:00 --nodes=1 --partition=normal --wrap="kraken2-build --standard --threads 24 --db kraken2_db &>> kraken2-build.log"`
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library... done.
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library... done.
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library... done.
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
All files processed, cleaning up extra sequence files... done, library complete.
Downloading UniVec_Core data from server... done.
Adding taxonomy ID of 28384 to all sequences... done.
Masking low-complexity regions of downloaded library... done.
Creating sequence ID to taxonomy ID map (step 1)...
Sequence ID to taxonomy ID map complete. [0.205s]
Estimating required capacity (step 2)...
Estimated hash table requirement: 48002940340 bytes
Capacity estimation complete. [10m40.846s]
Building database files (step 3)...
build_db: OMP only wants you to use 1 threads
xargs: cat: terminated by signal 13

`sbatch --time=24:00:00 --nodes=1 --partition=normal --wrap="kraken2-build --standard --db kraken2_db &>> kraken2-build.log"`
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library... done.
Step 1/2: Performing rsync file transfer of requested files
Rsync file transfer complete.
Step 2/2: Assigning taxonomic IDs to sequences
All files processed, cleaning up extra sequence files... done, library complete.
Masking low-complexity regions of downloaded library...

isardi commented 4 years ago

Did you figure out what the error means? I am also getting it:

build_db: OMP only wants you to use 1 threads
xargs: cat: terminated by signal 13

Any help would be greatly appreciated.

Thanks in advance. -Isabel

mohhefny commented 3 years ago

I also got the same issue!
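
On the `xargs: cat: terminated by signal 13` line in the logs above: signal 13 is SIGPIPE on Linux, which a process receives when it writes to a pipe whose reader has already exited. Here that suggests `cat` was still feeding sequence data when the program reading it (presumably `build_db`) stopped, so the message is likely a symptom and not the root cause. Why the reader stopped is not confirmed anywhere in this thread; running out of memory for the estimated hash table is one possibility, not a diagnosis. A small sketch for decoding the number:

```shell
# Translate the signal number from the log into its name.
sig_name="$(kill -l 13)"
echo "signal 13 is SIG${sig_name}"

# Shells report death-by-signal as exit status 128 + signal number,
# so this is the status to look for in job accounting output.
sig_status=$((128 + 13))
echo "exit status for a process killed by signal 13: ${sig_status}"
```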