Closed by ezecalvo 2 years ago
Hello,
Since we are not very familiar with `bsub`, we don't know why this happened. Here is one thing we believe could cause the problem: `hisat-3n-build` writes temporary files when building the hisat-3n index. If `bsub` runs multiple `hisat-3n-build` processes and they write to the same temporary file, it could cause an error. Could you open one host and one task for `hisat-3n-build` with `-p 30` multithreading?
Best, Leo
Hi,
Thanks for the fast response.
I'm running with one host and task. Do you have any command you use for queuing systems? It could be sbatch or something similar, just want to translate it into bsub.
I should also mention that using a smaller genome it works just fine in ~4hrs. I did this using just chromosome 1 from the fasta file for example.
Hello,
I just ran `hisat-3n-build` on a cluster with 256 GB of memory, and it looks OK. Here is my script:
./hisat-3n/hisat-3n-build --base-change T,C -p 30 --ss ../data/reference/genome.ss --exon ../data/reference/genome.exon ../data/reference/genome.fa ../tmp/hisat-3n_genome
Could you pull and build the newest `hisat-3n` and try again? The graph index building process may use more than 100 GB of memory. You should see many temporary files with the `.rf` suffix in your output directory about 10 minutes after index building starts.
Thanks, Leo
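The progress check described above (temporary `.rf` files appearing in the output directory) can be sketched as a small shell helper. This is only an illustration: the function name `count_rf_files` is hypothetical and not part of hisat-3n. The only details taken from the thread are the `.rf` suffix and that the files appear in the index output directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of hisat-3n): count the files ending in .rf
# that sit directly inside the given directory (default: current directory).
count_rf_files() {
    dir=${1:-.}
    # -maxdepth 1 restricts the search to the directory itself;
    # tr strips the padding that some wc implementations add.
    find "$dir" -maxdepth 1 -type f -name '*.rf' | wc -l | tr -d '[:space:]'
}
```

For an index prefix such as `hg38_hisat3n/hisat3_genome`, the call would be `count_rf_files hg38_hisat3n`. A count that stays at 0 well after the build has started may suggest that the build never got past reading its inputs.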
Hi,
That didn't work!
I got it to work (with or without bsub) when not using --ss and --exon. I checked the obvious things, like chromosome names being consistent between the FASTA file and the ss/exon files, and that looks fine!
This is what the ss and exon files look like:
hg38.ss
1 12056 12178 +
1 12226 12612 +
1 12696 12974 +
1 12720 13220 +
1 13051 13220 +
1 13373 13452 +
1 14500 15004 -
1 15037 15795 -
1 15946 16606 -
1 16764 16857 -
hg38.exon
1 11868 12226 +
1 12612 12720 +
1 12974 13051 +
1 13220 14500 +
1 15004 15037 -
1 15795 15946 -
1 16606 16764 -
1 16857 17054 -
Hello,
I tested `hisat-3n-build` with the `--ss`, `--exon`, and `-p 30` options. It takes about 2 hours to finish the build without error. To help with troubleshooting, could you share the links from which you downloaded `hg38.fa`, `hg38.ss`, and `hg38.exon`? Then we can test on our side.
Best, Leo
Hi,
Here are the files: https://www.dropbox.com/sh/fu901c8p79x5y15/AADNS3pSAHWYFJws4bdnBE85a?dl=0
I built hg38.ss and hg38.exon following the instructions in the HISAT2 manual.
Thanks!
Hello,
I just checked your files. Your `hg38.ss` and `hg38.exon` have some extraneous content at the end of the file. You can run `tail -n 50 hg38.ss` to see it. It looks like output from your job submission. Could you rebuild the `hg38.ss` and `hg38.exon` files and then build the hisat-3n index?
Thanks, Leo
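Stray lines like the ones described above can also be found mechanically. The following is a minimal sketch, not something provided by HISAT2 or hisat-3n: the function name `check_annotation` is hypothetical, and the expected layout (four whitespace-separated fields: chromosome, two integer coordinates, and a `+` or `-` strand) is an assumption based only on the sample lines posted in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of HISAT2 or hisat-3n): print every non-empty
# line that does not look like "chrom left right strand".
# Assumption: four whitespace-separated fields, as in the samples in this thread.
check_annotation() {
    awk '
        NF == 0 { next }    # ignore blank lines
        !(NF == 4 && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && ($4 == "+" || $4 == "-")) {
            # report the file name, line number, and offending line
            printf "%s:%d: %s\n", FILENAME, FNR, $0
            bad = 1
        }
        END { exit bad + 0 }    # non-zero exit status if anything was reported
    ' "$1"
}
```

Running `check_annotation hg38.ss` and `check_annotation hg38.exon` before indexing would list any line that does not fit the assumed layout. The function prints nothing and returns 0 when every line matches.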
Ouch, I'm deeply sorry for such a silly mistake. Just removed it and the index is building normally. Thanks a lot for the patience and the help!
Hi,
I'm using hisat-3n to build an hg38 index on an LSF cluster. The job has been running for 20 days without any change in the log file. Is this normal? What could be the problem? My files work fine when building an index with hisat2.
My command, using 30 threads with 10 GB of memory each:
bsub -q long -n 30 -R rusage[mem=10000] -R span[hosts=1] -W 720:00 hisat-3n-build --base-change T,C -p 30 --ss hg38.ss --exon hg38.exon hg38.fa hg38_hisat3n/hisat3_genome
My log file:
Thanks!