dfguan / purge_dups

haplotypic duplication identification tool
MIT License

UnboundLocalError: local variable 'workdir' referenced before assignment #29

Open kmkocot opened 4 years ago

kmkocot commented 4 years ago

Hello. I ran purge_dups successfully up through (including) the get_seqs step. See results from my slurm output file below. Both .hap.fa and .purged.fa files were produced but they are smaller than what I was expecting based on redundans and purge_haplotigs. Can you tell me what this error means and how to fix it (or if there is something to be worried about at all)? Thanks!

calculate coverage and self-alignment
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell13_m54071_181121_222304.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell13_m54071_181121_222304.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell15_m54071_181122_185001.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell15_m54071_181122_185001.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell16_m54071_181123_050312.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell16_m54071_181123_050312.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell14_m54071_181122_083636.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell14_m54071_181122_083636.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell17_m54071_181123_151614.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell17_m54071_181123_151614.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell4_m54278_181117_175312.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell4_m54278_181117_175312.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell9_m54071_181119_105514.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell9_m54071_181119_105514.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell3_m54278_181117_073937.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell3_m54278_181117_073937.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell5_m54278_181118_040647.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell5_m54278_181118_040647.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell18_m54071_181124_012926.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell18_m54071_181124_012926.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell1_m54278_181115_203253.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell1_m54278_181115_203253.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell10_m54071_181119_210906.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell10_m54071_181119_210906.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell19_m54071_181124_114235.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell19_m54071_181124_114235.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell21_m54278_181121_222423.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell21_m54278_181121_222423.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell7_m54278_181119_003430.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell7_m54278_181119_003430.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell6_m54278_181118_142025.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell6_m54278_181118_142025.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell22_m54278_181122_083840.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell22_m54278_181122_083840.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell11_m54278_181120_193119.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell11_m54278_181120_193119.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell12_m54278_181121_054502.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell12_m54278_181121_054502.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell8_m54071_181119_004122.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell8_m54071_181119_004122.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell20_m54071_181124_215600.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell20_m54071_181124_215600.paf
run successfully
command minimap2 -I 4G -x map-pb -t 16 /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta /grps2/kmk/2019-01-16_Quagga_mussel_genome_data_from_Yale_Passamaneck/Quagga_Sequel/Sequel_raw_bam_data/Drb016_cell2_m54278_181116_212527.subreads.samtools_fasta >genome.contigs/coverage/Drb016_cell2_m54278_181116_212527.paf
run successfully
command /kmk/scripts/purge_dups/bin/pbcstat -O genome.contigs/coverage genome.contigs/coverage/Drb016_cell10_m54071_181119_210906.paf genome.contigs/coverage/Drb016_cell11_m54278_181120_193119.paf genome.contigs/coverage/Drb016_cell12_m54278_181121_054502.paf genome.contigs/coverage/Drb016_cell13_m54071_181121_222304.paf genome.contigs/coverage/Drb016_cell14_m54071_181122_083636.paf genome.contigs/coverage/Drb016_cell15_m54071_181122_185001.paf genome.contigs/coverage/Drb016_cell16_m54071_181123_050312.paf genome.contigs/coverage/Drb016_cell17_m54071_181123_151614.paf genome.contigs/coverage/Drb016_cell18_m54071_181124_012926.paf genome.contigs/coverage/Drb016_cell19_m54071_181124_114235.paf genome.contigs/coverage/Drb016_cell1_m54278_181115_203253.paf genome.contigs/coverage/Drb016_cell20_m54071_181124_215600.paf genome.contigs/coverage/Drb016_cell21_m54278_181121_222423.paf genome.contigs/coverage/Drb016_cell22_m54278_181122_083840.paf genome.contigs/coverage/Drb016_cell2_m54278_181116_212527.paf genome.contigs/coverage/Drb016_cell3_m54278_181117_073937.paf genome.contigs/coverage/Drb016_cell4_m54278_181117_175312.paf genome.contigs/coverage/Drb016_cell5_m54278_181118_040647.paf genome.contigs/coverage/Drb016_cell6_m54278_181118_142025.paf genome.contigs/coverage/Drb016_cell7_m54278_181119_003430.paf genome.contigs/coverage/Drb016_cell8_m54071_181119_004122.paf genome.contigs/coverage/Drb016_cell9_m54071_181119_105514.paf
run successfully
command /kmk/scripts/purge_dups/bin/calcuts genome.contigs/coverage/PB.stat > genome.contigs/coverage/cutoffs
run successfully
command /kmk/scripts/purge_dups/bin/split_fa /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta > genome.contigs/split_aln/genome.split.fa
run successfully
command minimap2 -xasm5 -DP genome.contigs/split_aln/genome.split.fa genome.contigs/split_aln/genome.split.fa > genome.contigs/split_aln/genome.split.paf
run successfully
purge duplicates
command /kmk/scripts/purge_dups/bin/purge_dups -2 -c genome.contigs/coverage/PB.base.cov -T genome.contigs/coverage/cutoffs genome.contigs/split_aln/genome.split.paf > genome.contigs/purge_dups/dups.bed
run successfully
command /kmk/scripts/purge_dups/bin/get_seqs -p genome.contigs/seqs/genome genome.contigs/purge_dups/dups.bed /kmk/genome_assemblies/genome/2020-03-23_purge_dups_on_canu_assembly_no_pilon_first/genome.contigs.fasta
run successfully
Traceback (most recent call last):
  File "/kmk/scripts/purge_dups/scripts/run_purge_dups.py", line 314, in <module>
    sys.exit(cont(opts.config, opts.bin_dir, opts.spid, opts.pltfm, opts.wait, opts.retries))
  File "/kmk/scripts/purge_dups/scripts/run_purge_dups.py", line 283, in cont
    purged_ref = "{0}/{1}".format(workdir, fasta)
UnboundLocalError: local variable 'workdir' referenced before assignment

dfguan commented 4 years ago

Hi, I have fixed the bug in the script; it did not affect the results. Could you please show me what the coverage distribution looks like? You can plot it with the script "hist_plot.py" in the purge_dups directory, using the command "python3 hist_plot.py -c cutoffs PB.stat". Dengfeng.
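
For anyone else hitting this traceback: the sketch below shows how an UnboundLocalError of this kind typically arises in Python, and one way to avoid it. It is not the actual code of run_purge_dups.py, which is not shown in this thread; `build_purged_path`, `build_purged_path_fixed`, `raises_unbound` and the `out_dir` key are hypothetical names used only for illustration.

```python
def build_purged_path(config, fasta="purged.fa"):
    """Hypothetical stand-in for the failing step; not the real cont() body."""
    if "out_dir" in config:
        # workdir is bound only when this branch runs
        workdir = config["out_dir"]
    # If the branch above was skipped, workdir was never assigned,
    # so reading it here raises UnboundLocalError.
    return "{0}/{1}".format(workdir, fasta)


def build_purged_path_fixed(config, fasta="purged.fa"):
    """Same logic, but workdir is always bound before it is read."""
    workdir = config.get("out_dir", ".")
    return "{0}/{1}".format(workdir, fasta)


def raises_unbound(func, config):
    """Return True if func(config) raises UnboundLocalError."""
    try:
        func(config)
    except UnboundLocalError:
        return True
    return False
```

In the log above the exception is raised only after get_seqs has reported "run successfully", which fits the statement that the results are unaffected.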