faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

Phasing UCE alleles: phyluce_snp_bwa_multiple_align terminating with errors #193

Closed christinethacker closed 3 years ago

christinethacker commented 4 years ago

I'm having trouble generating the multialign-bams in the "Phasing UCE data" pipeline. Everything works until the "Mapping reads against contigs" step, where I submit this command:

    phyluce_snp_bwa_multiple_align \
        --config UCE-Hypseleotris-alleles.conf \
        --output multialign-bams \
        --cores 4 \
        --log-path log \
        --mem

It runs for a few files, but does not complete the bam files and kicks out two error files per taxon. The log looks like this:

2020-05-23 15:26:00,109 - phyluce_snp_bwa_multiple_align - INFO - ============ Starting phyluce_snp_bwa_multiple_align ============
2020-05-23 15:26:00,229 - phyluce_snp_bwa_multiple_align - INFO - Version: git fatal: not a git repository: '/Users/Mercury/anaconda2/envs/phyluce/lib/python2.7/site-packages/.git'
2020-05-23 15:26:00,229 - phyluce_snp_bwa_multiple_align - INFO - Argument --config: /Volumes/Alexandria/UCE-Hypseleotris-alleles/taxon-sets/all/UCE-Hypseleotris-alleles.conf
2020-05-23 15:26:00,229 - phyluce_snp_bwa_multiple_align - INFO - Argument --cores: 4
2020-05-23 15:26:00,229 - phyluce_snp_bwa_multiple_align - INFO - Argument --log_path: /Volumes/Alexandria/UCE-Hypseleotris-alleles/taxon-sets/all/log
2020-05-23 15:26:00,229 - phyluce_snp_bwa_multiple_align - INFO - Argument --mem: True
2020-05-23 15:26:00,229 - phyluce_snp_bwa_multiple_align - INFO - Argument --no_remove_duplicates: False
2020-05-23 15:26:00,229 - phyluce_snp_bwa_multiple_align - INFO - Argument --output: /Volumes/Alexandria/UCE-Hypseleotris-alleles/taxon-sets/all/multialign-bams
2020-05-23 15:26:00,229 - phyluce_snp_bwa_multiple_align - INFO - Argument --subfolder:
2020-05-23 15:26:00,229 - phyluce_snp_bwa_multiple_align - INFO - Argument --verbosity: INFO
2020-05-23 15:26:00,229 - phyluce_snp_bwa_multiple_align - INFO - ============ Starting phyluce_snp_bwa_multiple_align ============
2020-05-23 15:26:00,231 - phyluce_snp_bwa_multiple_align - INFO - Getting input filenames and creating output directories
2020-05-23 15:26:00,233 - phyluce_snp_bwa_multiple_align - INFO - You are running BWA-MEM
2020-05-23 15:26:00,233 - phyluce_snp_bwa_multiple_align - INFO - ----------- Processing Hypseleotris_HGSxHB_PU09_90HG_1 ----------
2020-05-23 15:26:00,233 - phyluce_snp_bwa_multiple_align - INFO - Finding fastq/fasta files
2020-05-23 15:26:00,464 - phyluce_snp_bwa_multiple_align - INFO - File type is fastq
2020-05-23 15:26:00,465 - phyluce_snp_bwa_multiple_align - INFO - Building BAM for Hypseleotris_HGSxHB_PU09_90HG_1
2020-05-23 15:32:34,862 - phyluce_snp_bwa_multiple_align - INFO - Cleaning BAM for Hypseleotris_HGSxHB_PU09_90HG_1
2020-05-23 15:32:36,634 - phyluce_snp_bwa_multiple_align - INFO - Adding RG header to BAM for Hypseleotris_HGSxHB_PU09_90HG_1
2020-05-23 15:32:37,632 - phyluce_snp_bwa_multiple_align - INFO - Marking read duplicates from BAM for Hypseleotris_HGSxHB_PU09_90HG_1
2020-05-23 15:32:38,821 - phyluce_snp_bwa_multiple_align - INFO - Building BAM for Hypseleotris_HGSxHB_PU09_90HG_1
2020-05-23 15:32:46,245 - phyluce_snp_bwa_multiple_align - INFO - Cleaning BAM for Hypseleotris_HGSxHB_PU09_90HG_1
2020-05-23 15:32:50,370 - phyluce_snp_bwa_multiple_align - INFO - Adding RG header to BAM for Hypseleotris_HGSxHB_PU09_90HG_1
2020-05-23 15:32:52,361 - phyluce_snp_bwa_multiple_align - INFO - Marking read duplicates from BAM for Hypseleotris_HGSxHB_PU09_90HG_1
2020-05-23 15:32:53,592 - phyluce_snp_bwa_multiple_align - INFO - Merging BAMs for Hypseleotris_HGSxHB_PU09_90HG_1
2020-05-23 15:32:54,393 - phyluce_snp_bwa_multiple_align - INFO - Indexing BAM for Hypseleotris_HGSxHB_PU09_90HG_1
2020-05-23 15:32:54,428 - phyluce_snp_bwa_multiple_align - INFO - --------- Processing Hypseleotris_HBxHX_PU_13_42A_LCG_B4 --------
2020-05-23 15:32:54,429 - phyluce_snp_bwa_multiple_align - INFO - Finding fastq/fasta files
2020-05-23 15:32:54,455 - phyluce_snp_bwa_multiple_align - INFO - File type is fastq
2020-05-23 15:32:54,457 - phyluce_snp_bwa_multiple_align - INFO - Building BAM for Hypseleotris_HBxHX_PU_13_42A_LCG_B4
2020-05-23 15:37:40,595 - phyluce_snp_bwa_multiple_align - INFO - Cleaning BAM for Hypseleotris_HBxHX_PU_13_42A_LCG_B4
2020-05-23 15:37:42,497 - phyluce_snp_bwa_multiple_align - INFO - Adding RG header to BAM for Hypseleotris_HBxHX_PU_13_42A_LCG_B4
2020-05-23 15:37:43,667 - phyluce_snp_bwa_multiple_align - INFO - Marking read duplicates from BAM for Hypseleotris_HBxHX_PU_13_42A_LCG_B4
2020-05-23 15:37:44,940 - phyluce_snp_bwa_multiple_align - INFO - Building BAM for Hypseleotris_HBxHX_PU_13_42A_LCG_B4
2020-05-23 15:37:52,081 - phyluce_snp_bwa_multiple_align - INFO - Cleaning BAM for Hypseleotris_HBxHX_PU_13_42A_LCG_B4
2020-05-23 15:37:53,825 - phyluce_snp_bwa_multiple_align - INFO - Adding RG header to BAM for Hypseleotris_HBxHX_PU_13_42A_LCG_B4
2020-05-23 15:37:55,034 - phyluce_snp_bwa_multiple_align - INFO - Marking read duplicates from BAM for Hypseleotris_HBxHX_PU_13_42A_LCG_B4
2020-05-23 15:37:56,222 - phyluce_snp_bwa_multiple_align - INFO - Merging BAMs for Hypseleotris_HBxHX_PU_13_42A_LCG_B4
2020-05-23 15:37:57,016 - phyluce_snp_bwa_multiple_align - INFO - Indexing BAM for Hypseleotris_HBxHX_PU_13_42A_LCG_B4
2020-05-23 15:37:57,025 - phyluce_snp_bwa_multiple_align - INFO - --------------- Processing Giuris_margaritacea_H28 --------------
2020-05-23 15:37:57,025 - phyluce_snp_bwa_multiple_align - INFO - Finding fastq/fasta files
2020-05-23 15:37:57,027 - phyluce_snp_bwa_multiple_align - INFO - File type is fastq
2020-05-23 15:37:57,028 - phyluce_snp_bwa_multiple_align - INFO - Building BAM for Giuris_margaritacea_H28
2020-05-23 15:38:50,299 - phyluce_snp_bwa_multiple_align - INFO - Cleaning BAM for Giuris_margaritacea_H28
2020-05-23 15:39:13,359 - phyluce_snp_bwa_multiple_align - INFO - Adding RG header to BAM for Giuris_margaritacea_H28
2020-05-23 15:39:40,030 - phyluce_snp_bwa_multiple_align - INFO - Marking read duplicates from BAM for Giuris_margaritacea_H28
2020-05-23 15:39:41,992 - phyluce_snp_bwa_multiple_align - INFO - Building BAM for Giuris_margaritacea_H28
2020-05-23 15:39:43,630 - phyluce_snp_bwa_multiple_align - INFO - Cleaning BAM for Giuris_margaritacea_H28
2020-05-23 15:39:45,162 - phyluce_snp_bwa_multiple_align - INFO - Adding RG header to BAM for Giuris_margaritacea_H28
2020-05-23 15:39:47,150 - phyluce_snp_bwa_multiple_align - INFO - Marking read duplicates from BAM for Giuris_margaritacea_H28
2020-05-23 15:39:48,931 - phyluce_snp_bwa_multiple_align - INFO - Merging BAMs for Giuris_margaritacea_H28
Traceback (most recent call last):
  File "/Users/Mercury/anaconda2/envs/phyluce/bin/phyluce_snp_bwa_multiple_align", line 193, in <module>
    main()
  File "/Users/Mercury/anaconda2/envs/phyluce/bin/phyluce_snp_bwa_multiple_align", line 182, in main
    bam = picard.merge_two_bams(log, sample, sample_dir, bam, bam_se)
  File "/Users/Mercury/anaconda2/envs/phyluce/lib/python2.7/site-packages/phyluce/picard.py", line 124, in merge_two_bams
    os.remove(bam)
OSError: [Errno 2] No such file or directory: '/Volumes/Alexandria/UCE-Hypseleotris-alleles/taxon-sets/all/multialign-bams/Giuris_margaritacea_H28/Giuris_margaritacea_H28-CL-RG-MD.bam'

One of the error files is attached (they are all similar). I am on macOS Catalina 10.15.4, to which I recently upgraded. Before the upgrade, the pipeline ran without errors; since the upgrade, phyluce_snp_bwa_multiple_align no longer works. I reinstalled phyluce (using conda, in its own environment), but that didn't help. I also tried both Java 1.7 and Java 1.8, with no difference.

Any advice is welcome!

Thanks,

Christine Thacker

hs_err_pid22735.log
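
The hs_err_pid*.log name is the pattern the HotSpot JVM uses for its fatal-error logs, so these files most likely come from the Java tools the pipeline calls rather than from phyluce's own Python code. A minimal sketch for collecting such logs and showing the top of each one, assuming they were written somewhere under the current working directory. The helper names find_jvm_error_logs and head are made up for illustration and are not part of phyluce.

```python
import os


def find_jvm_error_logs(root):
    """Return sorted paths of hs_err_pid*.log files found under root."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith("hs_err_pid") and name.endswith(".log"):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def head(path, n=10):
    """Return the first n lines of a text file, without trailing newlines."""
    lines = []
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i >= n:
                break
            lines.append(line.rstrip("\n"))
    return lines


if __name__ == "__main__":
    # Assumption: the logs live under the directory the pipeline was run from.
    for log_path in find_jvm_error_logs("."):
        print("==> " + log_path)
        for line in head(log_path):
            print(line)
```

The first lines of each log usually name the failure, which is a quicker starting point than the Python traceback, since the traceback only reports that an expected BAM file was never written.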

Deco313 commented 4 years ago

Bumping this

jessicadgordon commented 4 years ago

Hello all, I am having the same issue. For me, however, it stops running after the very first error:

2020-06-11 15:26:46,266 - phyluce_snp_bwa_multiple_align - INFO - Adding RG header to BAM for CE79s_contigs
Traceback (most recent call last):
  File "/usr/local/miniconda3/envs/phyluce/bin/phyluce_snp_bwa_multiple_align", line 193, in <module>
    main()
  File "/usr/local/miniconda3/envs/phyluce/bin/phyluce_snp_bwa_multiple_align", line 161, in main
    bam = picard.add_rg_header_info(log, sample, sample_dir, fc, bam, "pe")
  File "/usr/local/miniconda3/envs/phyluce/lib/python2.7/site-packages/phyluce/picard.py", line 102, in add_rg_header_info
    os.remove(bam)
OSError: [Errno 2] No such file or directory: '/home/jg18024/multialign-bams/CE79s_contigs/CE79s_contigs-CL.bam'

I also looked at the log files, and they report this exception:

Exception in thread "main" java.lang.UnsupportedClassVersionError: picard/cmdline/PicardCommandLine : Unsupported major.minor version 52.0


Our cluster is currently running:

openjdk version "1.7.0_91"
OpenJDK Runtime Environment (Zulu 7.12.0.3-linux64) (build 1.7.0_91-b15)
OpenJDK 64-Bit Server VM (Zulu 7.12.0.3-linux64) (build 24.91-b15, mixed mode)
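
Class-file major version 52 corresponds to Java 8, while a 1.7.0_91 runtime is Java 7, so this JVM cannot load that Picard build. A minimal sketch of the comparison, assuming the error text and the `java -version` banner have already been captured as strings. The function names required_java and running_java are hypothetical helpers, not part of phyluce or Picard.

```python
import re

# Class-file major versions start at 45 for JDK 1.1 and increase by one per
# release, so Java 7 is 51 and Java 8 is 52.
CLASS_MAJOR_OFFSET = 44


def required_java(error_text):
    """Return the Java release implied by an UnsupportedClassVersionError
    message, or None if no class version is found in the text."""
    match = re.search(r"major\.minor version (\d+)\.\d+", error_text)
    if match is None:
        return None
    return int(match.group(1)) - CLASS_MAJOR_OFFSET


def running_java(banner):
    """Return the Java release from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy scheme: "1.7.0_91" is Java 7. Newer scheme: "11.0.2" is Java 11.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


if __name__ == "__main__":
    error = ("java.lang.UnsupportedClassVersionError: "
             "picard/cmdline/PicardCommandLine : "
             "Unsupported major.minor version 52.0")
    banner = 'openjdk version "1.7.0_91"'
    need = required_java(error)
    have = running_java(banner)
    # prints: needs Java 8, running Java 7
    print("needs Java %d, running Java %d" % (need, have))
```

Note that `java -version` writes its banner to standard error, so capturing it from a script means reading stderr rather than stdout.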

Thank you, Jess :)

brantfaircloth commented 4 years ago

The Java version you are using looks incorrect for the version of Picard you are using. This should be solved by using the version of Java that comes with phyluce, but perhaps something on your HPC system is overriding.
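
One way to check whether something on the system is overriding the environment's Java is to see which `java` executable comes first on PATH and whether it lives inside the active conda environment. The sketch below is written to run on either Python 2.7 or Python 3 and assumes that activating the environment sets the CONDA_PREFIX variable. It inspects PATH lookup only; how phyluce and Picard actually locate Java is not shown in this thread, and the helper names first_on_path and is_inside are hypothetical.

```python
import os


def first_on_path(name, path_string):
    """Return the first executable called `name` in a PATH-style string,
    or None if there is none."""
    for directory in path_string.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def is_inside(path, prefix):
    """Return True if `path` lies under the directory `prefix`,
    after resolving symlinks on both."""
    path = os.path.realpath(path)
    prefix = os.path.realpath(prefix)
    return path == prefix or path.startswith(prefix + os.sep)


if __name__ == "__main__":
    java = first_on_path("java", os.environ.get("PATH", ""))
    # Assumption: CONDA_PREFIX points at the activated environment.
    env_prefix = os.environ.get("CONDA_PREFIX")
    if java is None:
        print("no java executable found on PATH")
    elif env_prefix and is_inside(java, env_prefix):
        print("java resolves inside the active conda environment: " + java)
    else:
        print("java resolves outside the active conda environment: " + java)
```

Run it from the same shell session, or inside the same job script, that launches phyluce_snp_bwa_multiple_align, because cluster module systems and login scripts can change PATH between sessions.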

jessicadgordon commented 4 years ago

Okay, thank you Brant, I really appreciate it. I have contacted the support team for our cluster about the Java version incompatibility. Phyluce is installed in its own separate conda environment. Hopefully, they can fix this.

jessicadgordon commented 4 years ago

> The Java version you are using looks incorrect for the version of Picard you are using. This should be solved by using the version of Java that comes with phyluce, but perhaps something on your HPC system is overriding.

It ended up being a problem with the phyluce install: it had a version of picard that was incompatible with the rest of phyluce. Our cluster manager fixed the issue by using conda to update openjdk. Conda wanted to "upgrade" openjdk from version 1.8.192 down to version 1.8.152. Once the versions matched, it was able to run.