WGLab / LinkedSV

Error after candidates file generation #4

Closed cipher2k5 closed 5 years ago

cipher2k5 commented 5 years ago

The analysis stopped during generation of the .candidates file. I was running germline SV analysis on a barcoded BAM file.

  File "/test/LinkedSV/LinkedSV/linkedsv.py", line 278, in <module>
    main()
  File "/test/LinkedSV/LinkedSV/linkedsv.py", line 43, in main
    detect_increased_fragment_ends(args, dbo_args, endpoint_args)
  File "/test/LinkedSV/LinkedSV/linkedsv.py", line 209, in detect_increased_fragment_ends
    find_paired_bk.find_paired_bk(args, dbo_args, endpoint_args)
  File "/test/LinkedSV/LinkedSV/scripts/find_paired_bk.py", line 705, in find_paired_bk
    build_graph_from_fragments(args, dbo_args, endpoint_args)
  File "/test/LinkedSV/LinkedSV/scripts/find_paired_bk.py", line 178, in build_graph_from_fragments
    clustering_nodes(args, dbo_args, endpoint_args, args.node33_candidate_file, args.node_cluster33_file, max_gap_distance, 'R_end', 'R_end')
  File "/test/LinkedSV/LinkedSV/scripts/find_paired_bk.py", line 310, in clustering_nodes
    black_region_key_set = read_black_region_file(args.black_region_bed_file, args.chrname2tid)
  File "/test/LinkedSV/LinkedSV/scripts/find_paired_bk.py", line 280, in read_black_region_file
    black_region_fp = open(black_region_bed_file, 'r')
IOError: [Errno 2] No such file or directory: ''

fangli80 commented 5 years ago

Hi, the error message shows that black_region_bed_file could not be found: the path passed to open() is an empty string. We provide blacklist files for human reference genomes.
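The empty string at the end of the traceback is the useful clue: the path was never set before open() was called. Below is a minimal sketch that reproduces the same error number and shows one way to fail earlier with a clearer message. The function open_black_region_file is hypothetical and is not part of LinkedSV.

```python
import errno


def open_black_region_file(path):
    """Hypothetical guard, not LinkedSV code: fail early with a clear message."""
    if not path:
        raise ValueError(
            "black region BED file is not set; pass -v (hg19, b37, or hg38) "
            "or --black_region_bed"
        )
    return open(path, "r")


# Opening an empty path reproduces the error number seen in the traceback.
try:
    open("", "r")
    reproduced_errno = None
except IOError as exc:  # IOError is an alias of OSError in Python 3
    reproduced_errno = exc.errno

# The guard turns the same situation into an actionable message.
try:
    open_black_region_file("")
    guard_message = None
except ValueError as exc:
    guard_message = str(exc)
```

Here reproduced_errno corresponds to errno.ENOENT, the "[Errno 2]" in the report.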

1) If your samples are human genomes.

Please specify the -v option. Valid values are hg19, b37, or hg38, depending on your reference genome. LinkedSV will choose the blacklist file according to the -v value.

If you have specified -v and still see the issue, please check whether the following files exist in the LinkedSV/black_lists folder:

alternative_contigs.txt
b37.2D.blacklist.gz
b37_black_list.bed
b37_gap.bed
hg19.2D.blacklist.gz
hg19_black_list.bed
hg19_gap.bed
hg38.2D.blacklist.gz
hg38_black_list.bed
hg38_gap.bed
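The check above can be scripted. This is a rough sketch, assuming the per-version files follow the naming pattern visible in the list; how LinkedSV selects among them internally is not shown in this thread, and the helper names are hypothetical.

```python
import os

# File names are taken from the list above. Grouping them by -v value is an
# assumption based on their prefixes, not confirmed LinkedSV behavior.
SHARED_FILES = ["alternative_contigs.txt"]
PER_VERSION_SUFFIXES = [".2D.blacklist.gz", "_black_list.bed", "_gap.bed"]
VALID_VERSIONS = ("hg19", "b37", "hg38")


def expected_blacklist_files(ref_version):
    """Return the file names expected for one -v value."""
    if ref_version not in VALID_VERSIONS:
        raise ValueError("unsupported reference version: %s" % ref_version)
    return SHARED_FILES + [ref_version + s for s in PER_VERSION_SUFFIXES]


def missing_blacklist_files(black_lists_dir, ref_version):
    """List the expected files that are absent from the black_lists folder."""
    return [
        name
        for name in expected_blacklist_files(ref_version)
        if not os.path.isfile(os.path.join(black_lists_dir, name))
    ]
```

For example, missing_blacklist_files("LinkedSV/black_lists", "hg19") returns an empty list when every expected file is present, and the names of the absent files otherwise.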

2) If your samples are not human genomes.

Please provide a blacklist bed file to --black_region_bed. The blacklist file should contain a small set of regions that give consistently spurious signal across samples.

If you don't have a blacklist on hand, you can use an empty file. In this case, LinkedSV will run much more slowly.
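A small sketch for preparing such a file follows. It assumes the blacklist uses the standard three BED columns (chromosome, start, end, tab-separated); the exact columns LinkedSV reads are not stated in this thread, and write_blacklist_bed is a hypothetical helper.

```python
def write_blacklist_bed(path, regions=()):
    """Write regions as tab-separated chrom, start, end lines.

    With no regions, the result is an empty file, which is the fallback
    described above. The three-column layout is an assumption.
    """
    with open(path, "w") as out:
        for chrom, start, end in regions:
            out.write("%s\t%d\t%d\n" % (chrom, start, end))
    return path
```

Calling write_blacklist_bed("empty_blacklist.bed") creates an empty file whose path can then be given to --black_region_bed. Passing a list such as [("chr1", 100, 200)] (made-up coordinates) writes one region per line instead.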

cipher2k5 commented 5 years ago

Thank you.
As you correctly pointed out, I had not included the -v flag. I am re-running the analysis with -v hg19 specified, and it is currently in progress.