Open maryamghr opened 1 year ago
Hi Maryam,
Thanks for bringing this up. This is not the expected behavior for AdVNTR genotype command on PacBio data. So I expect that there was an error. Could you please share the following?
.log
file? If so, could you please share them?Sara
Hi Sara,
Thanks for following up.
This is the advntr command that I run:
advntr genotype --alignment_file $my_pacbio_bam_file --working_directory log_dir
--pacbio -m vntr_data/hg38_selected_VNTRs_Illumina.db
--outfmt bed -t 30
There is no error message, but the main lines in the log file are like this for each vntr region:
INFO:extract_unmapped_reads_to_fasta_file executed in 0.000037s
INFO:get_filtered_read_ids executed in 0.747395s
INFO:get_vntr_filtered_reads_map executed in 0.747683s
DEBUG:finding repeat count from pacbio alignment file for 201
INFO:length_distribution of unmapped spanning reads: []
DEBUG:no reference positions for read. skipping self.check_if_pacbio_mapped_read_spans_vntr for this read
...
DEBUG:no reference positions for read. skipping self.check_if_pacbio_mapped_read_spans_vntr for this read
INFO:length_distribution of mapped spanning reads: []
INFO:get_spanning_reads_of_aligned_pacbio_reads executed in 0.171241s
INFO:There is no spanning read
INFO:find_repeat_count_from_pacbio_alignment_file executed in 0.208687s
I am using HG002 bam file (PacBio CLR reads aligned with minimap2) from giab. It is a whole-genome alignment file, and it has indeed coverage in VNTR regions. The coverage in the first VNTR region is ~40x for example.
Bests, Maryam
Hi Maryam,
Thanks for sharing the info. I'm working on it.
Sara
On Thu, Aug 31, 2023 at 2:31 AM maryamghr @.***> wrote:
Hi Sara,
Thanks for following up.
- This is the advntr command that I run:
advntr genotype --alignment_file $my_pacbio_bam_file --working_directory log_dir --pacbio -m vntr_data/hg38_selected_VNTRs_Illumina.db --outfmt bed -t 30
- There is no error message, but the main lines in the log file are like this for each vntr region:
INFO:extract_unmapped_reads_to_fasta_file executed in 0.000037s INFO:get_filtered_read_ids executed in 0.747395s INFO:get_vntr_filtered_reads_map executed in 0.747683s DEBUG:finding repeat count from pacbio alignment file for 201 INFO:length_distribution of unmapped spanning reads: [] DEBUG:no reference positions for read. skipping self.check_if_pacbio_mapped_read_spans_vntr for this read ... DEBUG:no reference positions for read. skipping self.check_if_pacbio_mapped_read_spans_vntr for this read INFO:length_distribution of mapped spanning reads: [] INFO:get_spanning_reads_of_aligned_pacbio_reads executed in 0.171241s INFO:There is no spanning read INFO:find_repeat_count_from_pacbio_alignment_file executed in 0.208687s
- I am using HG002 bam file (PacBio CLR reads aligned with minimap2) from giab. It is a whole-genome alignment file, and it has indeed coverage in VNTR regions. The coverage in the first VNTR region is ~40x for example.
Bests, Maryam
— Reply to this email directly, view it on GitHub https://urldefense.com/v3/__https://github.com/mehrdadbakhtiari/adVNTR/issues/64*issuecomment-1700691483__;Iw!!Mih3wA!GwYl_Njdo_6wGBk3fh63y-0WAQfrh4_LOHiK8OXssSPQSSJTRy1fplmquGBMky1C-U1JGCiXFgkrvVWYFUNnQL18bTXcAT9g$, or unsubscribe https://urldefense.com/v3/__https://github.com/notifications/unsubscribe-auth/AOGKYDXKQND3SUUJ7EMTWQ3XYBKVDANCNFSM6AAAAAA4C6AUQE__;!!Mih3wA!GwYl_Njdo_6wGBk3fh63y-0WAQfrh4_LOHiK8OXssSPQSSJTRy1fplmquGBMky1C-U1JGCiXFgkrvVWYFUNnQL18bUdHFvUF$ . You are receiving this because you commented.Message ID: @.***>
Salam Sara,
Thank you for working on it! I would like to ask another question in the meantime. Is there any way to use advntr to genotype vntrs given a custom set of regions that are given as a bed file (with an optional column specifying the motif)? So far, I only saw the option for adding a custom vntr region, which needs genomic coordinates of a single region. I want to run it on a bed file including my regions of interest (if possible).
Bests, Maryam
Hi Sara,
Thanks for following up.
- This is the advntr command that I run:
advntr genotype --alignment_file $my_pacbio_bam_file --working_directory log_dir --pacbio -m vntr_data/hg38_selected_VNTRs_Illumina.db --outfmt bed -t 30
- There is no error message, but the main lines in the log file are like this for each vntr region:
INFO:extract_unmapped_reads_to_fasta_file executed in 0.000037s INFO:get_filtered_read_ids executed in 0.747395s INFO:get_vntr_filtered_reads_map executed in 0.747683s DEBUG:finding repeat count from pacbio alignment file for 201 INFO:length_distribution of unmapped spanning reads: [] DEBUG:no reference positions for read. skipping self.check_if_pacbio_mapped_read_spans_vntr for this read ... DEBUG:no reference positions for read. skipping self.check_if_pacbio_mapped_read_spans_vntr for this read INFO:length_distribution of mapped spanning reads: [] INFO:get_spanning_reads_of_aligned_pacbio_reads executed in 0.171241s INFO:There is no spanning read INFO:find_repeat_count_from_pacbio_alignment_file executed in 0.208687s
- I am using HG002 bam file (PacBio CLR reads aligned with minimap2) from giab. It is a whole-genome alignment file, and it has indeed coverage in VNTR regions. The coverage in the first VNTR region is ~40x for example.
Bests, Maryam
Hello, have you solved your problem? How to solve it. I had a similar problem.
I tried adVNTR on PacBio data and the output bed file has only None values in R1 and R2 columns. It basically does not genotype any tandem repeat region.