brentp / bwa-meth

fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome
https://arxiv.org/abs/1401.1129
MIT License

could not make reference genome #25

Closed aheravi closed 8 years ago

aheravi commented 8 years ago

Hi, I am trying to create a reference genome from hg38 and am getting the following error:

/bwameth.py index hg38_no_alt.fa
converting c2t in hg38_no_alt.fa to hg38_no_alt.fa.bwameth.c2t
indexing: hg38_no_alt.fa.bwameth.c2t
[bwa_index] Pack FASTA... 53.61 sec
[bwa_index] Reverse the packed sequence... 19.25 sec
[bwa_index] Construct BWT for the packed sequence...
TextLengthFromBytePacked(): text length > 2^32!
cmd was:bwa index -a bwtsw hg38_no_alt.fa.bwameth.c2t
Traceback (most recent call last):
  File "/bwa-meth-0.10/bwameth.py", line 601, in <module>
    main(sys.argv[1:])
  File "/bwa-meth-0.10/bwameth.py", line 548, in main
    sys.exit(bwa_index(convert_fasta(args[1])))
  File "/bwa-meth-0.10/bwameth.py", line 151, in bwa_index
    run("bwa index -a bwtsw %s" % fa)
  File "/bwa-meth-0.10/bwameth.py", line 60, in run
    list(nopen("|%s" % cmd.lstrip("|")))
  File "/anaconda/lib/python2.7/site-packages/toolshed-0.4.0-py2.7.egg/toolshed/files.py", line 53, in process_iter
    raise ProcessException(cmd)
toolshed.files.ProcessException: bwa index -a bwtsw hg38_no_alt.fa.bwameth.c2t

Any idea on resolving the issue? Thanks, Ali
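
One plausible reading of the "text length > 2^32" message: the converted reference that bwa-meth writes holds both a C-to-T forward copy and a G-to-A reverse copy of every sequence, so it is roughly twice the size of the input, and a human genome may then no longer fit in a 32-bit length. The sketch below only does that arithmetic. The figure of about 3.1 billion bases for hg38 is a round-number assumption, and the function names are hypothetical; they are not part of bwa-meth or bwa.

```python
# Rough size check: does a doubled (forward + reverse converted) reference
# exceed a 32-bit text length? All names here are illustrative.

LIMIT_32BIT = 2 ** 32  # the bound named in the error message


def converted_length(genome_bases: int) -> int:
    """Length of a converted reference holding two copies of the genome."""
    return 2 * genome_bases


def exceeds_32bit_limit(genome_bases: int) -> bool:
    """True if the doubled reference is longer than 2^32 bases."""
    return converted_length(genome_bases) > LIMIT_32BIT


# Assumption: hg38 is on the order of 3.1 billion bases.
APPROX_HG38_BASES = 3_100_000_000

if __name__ == "__main__":
    print(converted_length(APPROX_HG38_BASES), ">", LIMIT_32BIT)
    print(exceeds_32bit_limit(APPROX_HG38_BASES))
```

If that reading is right, the failure would point at the bwa binary doing the indexing rather than at the fasta file itself.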

brentp commented 8 years ago

are you on a 32 bit machine?

and do you have the most recent version of bwa?

aheravi commented 8 years ago

Hi brentp, no, I am on a 64-bit machine. Version: 0.5.7 (r1310)

aheravi commented 8 years ago

Same problem using bwa Version: 0.7.6a-r433

cmd was:bwa index -a bwtsw hg38_no_alt.fa.bwameth.c2t
Traceback (most recent call last):
  File "/bwa-meth-0.10/bwameth.py", line 601, in <module>
    main(sys.argv[1:])
  File "/bwa-meth-0.10/bwameth.py", line 548, in main
    sys.exit(bwa_index(convert_fasta(args[1])))
  File "/bwa-meth-0.10/bwameth.py", line 151, in bwa_index
    run("bwa index -a bwtsw %s" % fa)
  File "/bwa-meth-0.10/bwameth.py", line 60, in run
    list(nopen("|%s" % cmd.lstrip("|")))
  File "/anaconda/lib/python2.7/site-packages/toolshed-0.4.0-py2.7.egg/toolshed/files.py", line 53, in process_iter
    raise ProcessException(cmd)
toolshed.files.ProcessException: bwa index -a bwtsw hg38_no_alt.fa.bwameth.c2t

brentp commented 8 years ago

Can you try running this directly:

bwa index -a bwtsw hg38_no_alt.fa.bwameth.c2t

aheravi commented 8 years ago

I actually did, but I am still getting an index error when trying to align my dataset:

Traceback (most recent call last):
  File "/bwa-meth-0.10/bwameth.py", line 601, in <module>
    main(sys.argv[1:])
  File "/bwa-meth-0.10/bwameth.py", line 586, in main
    set_as_failed=args.set_as_failed)
  File "/bwa-meth-0.10/bwameth.py", line 246, in bwa_mem
    raise BWAMethException("first run bwameth.py index %s" % fa)
__main__.BWAMethException: first run bwameth.py index /bwameth_genome_hg38/

Files under my genome folder:

hg38_no_alt.fa
hg38_no_alt.fa.bwameth.c2t
hg38_no_alt.fa.bwameth.c2t.amb
hg38_no_alt.fa.bwameth.c2t.ann
hg38_no_alt.fa.bwameth.c2t.bwt
hg38_no_alt.fa.bwameth.c2t.pac
hg38_no_alt.fa.bwameth.c2t.rpac
hg38_no_alt.fa.bwameth.c2t.sa

brentp commented 8 years ago

When you run bwameth, you need to give it the path to the fasta file. It appears from the error message that you are passing the path to the directory that contains it. If you continue to have problems, please post the full invocation and the full error message.
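
A pre-flight check along these lines can catch the directory-versus-file mix-up before alignment starts. This is a minimal sketch, not what bwameth.py itself does: `check_reference` is a hypothetical helper, and the list of index suffixes is an assumption taken from the file listing earlier in this thread.

```python
from pathlib import Path

# Assumption: these are the index files expected next to the converted
# reference, based on the folder listing posted above.
INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def check_reference(reference):
    """Return a list of problems with the --reference argument (empty if none)."""
    ref = Path(reference)
    if ref.is_dir():
        return ["%s is a directory; pass the fasta file inside it" % ref]
    if not ref.is_file():
        return ["%s does not exist" % ref]

    problems = []
    converted = Path(str(ref) + ".bwameth.c2t")
    expected = [converted] + [Path(str(converted) + s) for s in INDEX_SUFFIXES]
    for path in expected:
        if not path.exists():
            problems.append("missing: %s" % path)
    return problems


if __name__ == "__main__":
    import sys

    for problem in check_reference(sys.argv[1]):
        print(problem)
```

Called with the genome folder it reports the directory problem; called with hg38_no_alt.fa it lists any index files that are not present.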

aheravi commented 8 years ago

Thanks, that has been resolved. Now on to the next error:

[main] unrecognized command 'mem'
Traceback (most recent call last):
  File "/bwa-meth-0.10/bwameth.py", line 601, in <module>
    main(sys.argv[1:])
  File "/bwa-meth-0.10/bwameth.py", line 586, in main
    set_as_failed=args.set_as_failed)
  File "/bwa-meth-0.10/bwameth.py", line 259, in bwa_mem
    as_bam(cmd, fa, prefix, calmd, set_as_failed)
  File "/bwa-meth-0.10/bwameth.py", line 293, in as_bam
    raise Exception("bad or empty fastqs")
Exception: bad or empty fastqs

The fastqs are not "bad or empty"! I already aligned those with bismark and novoalign.

brentp commented 8 years ago

Please post the full invocation and the full error message.

aheravi commented 8 years ago

FQ1=s_1_1_001.fastq
FQ2=s_1_2_001.fastq
REFERENCE=hg38_no_alt.fa

/bwa-meth-0.10/bwameth.py --threads 16 --prefix $PREFIX --reference $REFERENCE $FQ1 $FQ2 >run_bwaMeth.log 2>&1

cat run_bwaMeth.log
running: bwa mem -T 40 -B 2 -L 10 -CM -U 100 -p -R '@RG ID:s_1__001 SM:s_1__001' -t 16 hg38_no_alt.fa.bwameth.c2t '<anaconda/bin/python /bwa-meth-0.10/bwameth.py c2t s_1_1_001.fastq s_1_2_001.fastq'
writing to: samtools view -bS - | samtools sort - a34002_bwaMeth
[main] unrecognized command 'mem'
Traceback (most recent call last):
  File "/bwa-meth-0.10/bwameth.py", line 601, in <module>
    main(sys.argv[1:])
  File "/bwa-meth-0.10/bwameth.py", line 586, in main
    set_as_failed=args.set_as_failed)
  File "/bwa-meth-0.10/bwameth.py", line 259, in bwa_mem
    as_bam(cmd, fa, prefix, calmd, set_as_failed)
  File "/bwa-meth-0.10/bwameth.py", line 293, in as_bam
    raise Exception("bad or empty fastqs")
Exception: bad or empty fastqs

brentp commented 8 years ago

[main] unrecognized command 'mem'

You need a newer version of bwa, one that provides the mem command.
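
To see which bwa a script will actually pick up, a small version check can help. The sketch below parses version strings in the two forms quoted in this thread. The minimum of 0.7.0 for the mem command is an assumption on my part (mem arrived with the 0.7 series, as far as I know), and `parse_bwa_version`, `supports_mem`, and `bwa_usage_text` are hypothetical helper names, not part of bwa or bwa-meth.

```python
import re
import subprocess

# Assumption: the mem command is available from bwa 0.7.0 onward.
MIN_MEM_VERSION = (0, 7, 0)


def parse_bwa_version(text):
    """Extract (major, minor, patch) from text such as 'Version: 0.7.6a-r433'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.groups())


def supports_mem(text, minimum=MIN_MEM_VERSION):
    """True if the version found in text is at least the assumed minimum."""
    return parse_bwa_version(text) >= minimum


def bwa_usage_text():
    """Run the first bwa on PATH with no arguments and return what it prints.

    bwa writes its usage text, including the Version line, to stderr.
    Raises FileNotFoundError if no bwa is on PATH.
    """
    result = subprocess.run(["bwa"], capture_output=True, text=True)
    return result.stderr + result.stdout


if __name__ == "__main__":
    print(supports_mem("Version: 0.5.7 (r1310)"))
    print(supports_mem("Version: 0.7.6a-r433"))
```

Two versions are reported earlier in this thread, so it is worth confirming that the one found first on PATH is the newer one; `supports_mem(bwa_usage_text())` checks exactly that binary.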