brentp / bwa-meth

fast and accurate alignment of BS-Seq reads using bwa-mem and a 3-letter genome
https://arxiv.org/abs/1401.1129
MIT License

raise BWAMethException("first run bwameth.py index %s" % fa) #64

Open sengoku93 opened 4 years ago

sengoku93 commented 4 years ago

After running this command: /nfs/sw/bwameth/bwameth-0.2.0/bin/bwameth.py --reference methylation_cfDNA/resources/bwameth/hg19.fa trunc_test/trunc_test.R1.fastq trunc_test/trunc_test.R2.fastq --threads 4 > trunc_test_aligned.bam

And trying multiple versions of python: python/2.7.10 python/2.7.14 python/2.7.8-museq python/3.6.1 python/3.6.7 python/2.7.11 python/2.7.3 python/3.4.5 python/3.6.2 python/3.7.1 python/2.7.11-shared python/2.7.5 python/3.5.1 python/3.6.4 python/ucs4 python/2.7.12 python/2.7.8 python/3.5.2 python/3.6.4.0

I am still getting this error:

Traceback (most recent call last):
  File "/nfs/sw/bwameth/bwameth-0.2.0/bin/bwameth.py", line 4, in <module>
    __import__('pkg_resources').run_script('bwameth==0.2.0', 'bwameth.py')
  File "/nfs/sw/bwameth/bwameth-0.2.0/lib/python2.7/site-packages/pkg_resources/__init__.py", line 735, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/nfs/sw/bwameth/bwameth-0.2.0/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1652, in run_script
    exec(code, namespace, namespace)
  File "/nfs/sw/bwameth/bwameth-0.2.0/lib/python2.7/site-packages/bwameth-0.2.0-py2.7.egg/EGG-INFO/scripts/bwameth.py", line 450, in <module>
    main(sys.argv[1:])
  File "/nfs/sw/bwameth/bwameth-0.2.0/lib/python2.7/site-packages/bwameth-0.2.0-py2.7.egg/EGG-INFO/scripts/bwameth.py", line 447, in main
    set_as_failed=args.set_as_failed)
  File "/nfs/sw/bwameth/bwameth-0.2.0/lib/python2.7/site-packages/bwameth-0.2.0-py2.7.egg/EGG-INFO/scripts/bwameth.py", line 260, in bwa_mem
    raise BWAMethException("first run bwameth.py index %s" % fa)
__main__.BWAMethException: first run bwameth.py index methylation_cfDNA/resources/bwameth/hg19.fa

Any suggestions?

sengoku93 commented 4 years ago

It seems the reference isn't the issue; the index was generated successfully:

[bwt_gen] Finished constructing BWT in 1308 iterations.
[bwa_index] 4514.01 seconds elapse.
[bwa_index] Update BWT... 23.31 sec
[bwa_index] Pack forward-only FASTA... 22.44 sec
[bwa_index] Construct SA from BWT and Occ...

brentp commented 4 years ago

I would say to make sure that methylation_cfDNA/resources/bwameth/hg19.fa (a relative path) actually exists and that all the .c2t and index files exist in the directory.
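A minimal sketch of that existence check, for anyone who wants to script it. The function name `missing_index_files` is hypothetical, and the suffix list is an assumption taken from the classic bwa file listing posted later in this thread; a bwa-mem2 index produces a different set of files, and this is not bwameth's own check.

```python
import os

# Suffixes copied from the directory listing in this thread (classic bwa index).
# Assumption: a bwa-mem2 index will use different suffixes.
C2T_SUFFIX = ".bwameth.c2t"
INDEX_SUFFIXES = (".bwt", ".pac", ".ann", ".amb", ".sa")


def missing_index_files(fasta_path):
    """Return the expected files (FASTA, c2t, index) that do not exist."""
    # Resolve relative paths against the current working directory,
    # which is what a relative --reference would be resolved against.
    fasta_path = os.path.abspath(fasta_path)
    c2t = fasta_path + C2T_SUFFIX
    expected = [fasta_path, c2t] + [c2t + suffix for suffix in INDEX_SUFFIXES]
    return [path for path in expected if not os.path.isfile(path)]
```

Running `missing_index_files("methylation_cfDNA/resources/bwameth/hg19.fa")` from the same working directory as the failing job should return an empty list if everything is where it is expected to be.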

sengoku93 commented 4 years ago

I provide the full path in the actual command; I just did not want to post it publicly. For hg19.fa, do you have an estimate of how big the .c2t files should be?

3.0G Aug 9 15:43 resources/bwameth/hg19.fa
5.9G Aug 14 10:13 resources/bwameth/hg19.fa.bwameth.c2t
5.8G Aug 15 12:43 resources/bwameth/hg19.fa.bwameth.c2t.bwt
1.5G Aug 15 12:44 resources/bwameth/hg19.fa.bwameth.c2t.pac
14K Aug 15 12:44 resources/bwameth/hg19.fa.bwameth.c2t.ann
13K Aug 15 12:44 resources/bwameth/hg19.fa.bwameth.c2t.amb
2.9G Aug 15 13:18 resources/bwameth/hg19.fa.bwameth.c2t.sa

brentp commented 4 years ago

that looks ok. my guess is that there's something odd with how you're specifying the path. maybe the part of the path you have elided has spaces or is incorrect somehow? or maybe it's on NFS and not available on your compute node or something like this.

sengoku93 commented 4 years ago

What does NSF mean?

sengoku93 commented 4 years ago

NFS*

brentp commented 4 years ago

network file system

rmoran7 commented 2 years ago

This is probably resolved. However, I came across this issue, and the problem was that the timestamp of the .fa was newer than that of the c2t files. There were no changes to the reference in my case, but moving files after indexing caused the times to change. The code is specifically designed to check for this, for good reason, but a better error message would be nice.
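A rough way to spot this situation is to compare modification times directly. The sketch below is not bwameth's actual check; `stale_index_files` is a hypothetical helper, and treating "missing" and "not newer than the FASTA" as the same failure is an assumption made for illustration.

```python
import os


def stale_index_files(fasta_path, index_paths):
    """Return index files that are missing or not newer than the FASTA."""
    fasta_mtime = os.path.getmtime(fasta_path)
    stale = []
    for path in index_paths:
        # A file that was copied or moved without preserving timestamps
        # can end up older than (or equal to) the FASTA it was built from.
        if not os.path.exists(path) or os.path.getmtime(path) <= fasta_mtime:
            stale.append(path)
    return stale
```

For example, `stale_index_files("hg19.fa", ["hg19.fa.bwameth.c2t", "hg19.fa.bwameth.c2t.amb"])` lists whichever of those files would look out of date relative to the reference.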

MagdalenaWinklhofer commented 1 year ago

Error: "BWAMethException: first run bwameth.py index ... OR bwameth.py index-mem2 ... OR make sure the modification time on the generated c2t files is newer than on the .fa file"

Hi, I encountered the same problem but could not solve it. I am working on an HPC and indexed my genome in a separate slurm script (only done once) and struggle now with the alignment. I made sure that the .fa file is older than the new index files (.c2t) and that they are all stored in the same directory. The indexing worked fine, but I could not get the alignment to work.

"bwameth.py --reference $REF some_R1.fastq.gz some_R2.fastq.gz > some.output.sam"

For the "--reference", I tried to specify the directory of the indexes (relative path), and I gave the program only two sequences (like in the example) to test it. The indexes are not in the same directory as the sequences. Could that be the issue?

My command looks like this: "bwameth.py --reference .../directory_of_the_index/ .../file_R1.fq.gz .../file_R2.fq.gz -t 4 > test_aligned.bam"

Could you let me know what could be wrong or what else I could test to get the program running?

brentp commented 1 year ago

the argument to --reference should be the original .fasta. so e.g.

--reference hg38.fa

ulazcanoCICbioGUNE commented 5 months ago

Good morning, I'm encountering the same error and none of the previous answers worked for me.

"File "/NewCluster_Software/conda_envs/bwameth/bin/bwameth.py", line 349, in bwa_mem
    raise BWAMethException("first run bwameth.py index %s OR bwameth.py index-mem2 %s OR make sure the modification time on the generated c2t files is newer than on the .fa file" % (fa, fa))"

I am also working on an HPC and indexed my genome in a separate slurm script. When trying to do the alignment with a .fastq file, I get the error. I have also checked that the reference genome file (.fa.gz) is older than the index files and that they are all in the same directory.

Here is the command I'm trying to run:

"bwameth.py --reference /Genomes/Ref_genome/Homo_sapiens.GRCh38.dna.primary_assembly_110.fa.gz /FASTQ/SRR8261134.fastq > test_SRR8261134.bam"

Thanks in advance for your help!

Uxue

brentp commented 5 months ago

@ulazcanoCICbioGUNE I think you'll need to decompress /Genomes/Ref_genome/Homo_sapiens.GRCh38.dna.primary_assembly_110.fa.gz and then re-run the indexing with bwameth.py index /Genomes/Ref_genome/Homo_sapiens.GRCh38.dna.primary_assembly_110.fa
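If a command-line gunzip is not convenient, the decompression step can be sketched with the Python standard library. `gunzip_fasta` is a hypothetical helper name; the sketch keeps the original .gz file, and indexing with `bwameth.py index` still has to be re-run on the decompressed file afterwards.

```python
import gzip
import shutil


def gunzip_fasta(gz_path, out_path=None):
    """Decompress gz_path to out_path (default: gz_path without '.gz')."""
    if out_path is None:
        if not gz_path.endswith(".gz"):
            raise ValueError("expected a path ending in .gz: %s" % gz_path)
        out_path = gz_path[: -len(".gz")]
    # Stream in chunks so a multi-gigabyte genome is never held in memory.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path
```

Because the decompressed FASTA is written fresh, its modification time will be newer than any index files built earlier, which is another reason the index needs to be regenerated after this step.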

ulazcanoCICbioGUNE commented 5 months ago

Hello, I have followed your recommendations, but the error, even if it has changed a bit, is still the same:

Traceback (most recent call last):
  File "/NewCluster_Software/conda_envs/bwameth/bin/bwameth.py", line 563, in <module>
    main(sys.argv[1:])
  File "/NewCluster_Software/conda_envs/bwameth/bin/bwameth.py", line 554, in main
    bwa_mem(args.reference, conv_fqs_cmd, ' '.join(map(str, pass_through_args)),
  File "/NewCluster_Software/conda_envs/bwameth/bin/bwameth.py", line 349, in bwa_mem
    raise BWAMethException("first run bwameth.py index %s OR bwameth.py index-mem2 %s OR make sure the modification time on the generated c2t files is newer than on the .fa file" % (fa, fa))
BWAMethException: first run bwameth.py index /Genomes_Rocky/Ref_genome/Homo_sapiens.GRCh38.dna.primary_assembly_110.fa OR bwameth.py index-mem2 /Genomes_Rocky/Ref_genome/Homo_sapiens.GRCh38.dna.primary_assembly_110.fa OR make sure the modification time on the generated c2t files is newer than on the .fa file

This is how I generated the Index:

[ulazcano@gn12 Ref_genome]$ bwameth.py index-mem2 Homo_sapiens.GRCh38.dna.primary_assembly_110.fa
converting c2t in Homo_sapiens.GRCh38.dna.primary_assembly_110.fa to Homo_sapiens.GRCh38.dna.primary_assembly_110.fa.bwameth.c2t
indexing with bwa-mem2: Homo_sapiens.GRCh38.dna.primary_assembly_110.fa.bwameth.c2t
Looking to launch executable "/NewCluster_Software/conda_envs/bwameth/bin/bwa-mem2.avx512bw", simd = .avx512bw
Launching executable "/NewCluster_Software/conda_envs/bwameth/bin/bwa-mem2.avx512bw"
[bwa_index] Pack FASTA... 28.07 sec
* Entering FMI_search
init ticks = 330480707462
ref seq len = 12399002872
binary seq ticks = 343408854256
build suffix-array ticks = 3739215915280
pos: 1549875360, ref_seq_len__: 1549875359
build fm-index ticks = 1126909408080
Total time taken: 2238.0656

Many, many thanks for your rapid answer. It is my first time working with BS-seq data and I really appreciate your help.

Uxue

brentp commented 5 months ago

you could try:

touch Homo_sapiens.GRCh38.dna.primary_assembly_110.fa.pac
touch Homo_sapiens.GRCh38.dna.primary_assembly_110.fa.amb

to make sure their mod times are newer than the .fa file.
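The same effect as `touch` can be had from Python with `os.utime`. This is only a sketch: `touch_existing` is a hypothetical helper, and which index files bwameth actually compares against the .fa is not stated in this thread, so the list of paths passed in is the caller's assumption.

```python
import os


def touch_existing(paths):
    """Set the mtime of each existing path to now; return the paths touched."""
    touched = []
    for path in paths:
        # Unlike the touch command, do not create files that are missing:
        # an empty index file would only hide the real problem.
        if os.path.exists(path):
            os.utime(path, None)  # None means "use the current time"
            touched.append(path)
    return touched
```

Comparing the returned list against the list passed in also shows quickly whether any of the expected index files are absent.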

ulazcanoCICbioGUNE commented 5 months ago

I have checked and it seems it's all right:

 [ulazcano@gn00 Ref_genome]$ stat Homo_sapiens.GRCh38.dna.primary_assembly_110.fa
  File: Homo_sapiens.GRCh38.dna.primary_assembly_110.fa
  Size: 3151425851      Blocks: 5761668    IO Block: 131072 regular file
Device: 41h/65d Inode: 10961124795734257053  Links: 1
Access: (0777/-rwxrwxrwx)  Uid: (545414668/ulazcano)   Gid: (545403450/gparkaitz)
Access: 2024-01-11 10:22:06.451717469 +0100
Modify: 2024-01-11 10:21:40.446774300 +0100
Change: 2024-01-19 10:10:08.740376916 +0100
 Birth: -
(/vols/GPArkaitz_bigdata/DATA_shared/NewCluster_Software/conda_envs/Bismark) [ulazcano@gn00 Ref_genome]$ stat Homo_sapiens.GRCh38.dna.primary_assembly_110.fa.bwameth.c2t.pac
  File: Homo_sapiens.GRCh38.dna.primary_assembly_110.fa.bwameth.c2t.pac
  Size: 1549875361      Blocks: 2930496    IO Block: 131072 regular file
Device: 41h/65d Inode: 12326907383725557697  Links: 1
Access: (0644/-rw-r--r--)  Uid: (545414668/ulazcano)   Gid: (545403450/gparkaitz)
Access: 2024-01-19 10:13:47.625990115 +0100
Modify: 2024-01-19 10:15:29.022645095 +0100
Change: 2024-01-19 10:15:29.022645095 +0100
 Birth: -
(/vols/GPArkaitz_bigdata/DATA_shared/NewCluster_Software/conda_envs/Bismark) [ulazcano@gn00 Ref_genome]$ stat Homo_sapiens.GRCh38.dna.primary_assembly_110.fa.bwameth.c2t.amb
  File: Homo_sapiens.GRCh38.dna.primary_assembly_110.fa.bwameth.c2t.amb
  Size: 36871           Blocks: 76         IO Block: 131072 regular file
Device: 41h/65d Inode: 13638445480072010433  Links: 1
Access: (0644/-rw-r--r--)  Uid: (545414668/ulazcano)   Gid: (545403450/gparkaitz)
Access: 2024-01-19 10:15:29.412551602 +0100
Modify: 2024-01-19 10:15:29.415484699 +0100
Change: 2024-01-19 10:15:29.415484699 +0100
 Birth: -

brentp commented 5 months ago

you could comment out that check in bwameth.py