kevbrick / genotype_prdm9_LR

Genotype PRDM9 from long read amplicon sequencing

Genotyping from BAM files #1

Open Rashesh7 opened 1 year ago

Rashesh7 commented 1 year ago

Hello,

Thank you for this tool! Your paper on PRDM9 allelic variation is quite interesting. I also have some samples sequenced on PacBio, and I was wondering whether you have a version of this script that takes the CCS BAM file directly as input.

Alternatively, how do you convert the CCS BAM files to FASTA format for use as input to this script?

Your suggestions are appreciated.

Best Regards, Rashesh

kevbrick commented 1 year ago

Hi Rashesh,

The code currently only accepts FASTA format input reads. To convert the BAM to a FASTA file, you can simply use samtools:

samtools fasta exampleCCS.bam >exampleCCS.fa

Hope that helps. Kevin
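After a conversion like the one above, it may be worth confirming that the FASTA file actually contains reads before passing it to the pipeline. The following is a minimal sketch, not part of the tool: the function name `fasta_summary` is hypothetical, and it only counts records and flags headers that have no sequence lines under them.

```python
from typing import Iterable, Tuple


def fasta_summary(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (number of records, number of records with an empty sequence)."""
    records = 0
    empty = 0
    seq_len = 0
    in_record = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # Close the previous record before starting a new one.
            if in_record and seq_len == 0:
                empty += 1
            records += 1
            in_record = True
            seq_len = 0
        elif in_record:
            seq_len += len(line)
    # Close the final record.
    if in_record and seq_len == 0:
        empty += 1
    return records, empty


if __name__ == "__main__":
    import sys

    # Usage (file name taken from the example above):
    #   python check_fasta.py exampleCCS.fa
    with open(sys.argv[1]) as handle:
        n_records, n_empty = fasta_summary(handle)
    print(f"{n_records} records, {n_empty} with no sequence")
```

A result of zero records, or a non-zero count of empty records, would suggest the problem lies in the input rather than in the genotyping step.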

Rashesh7 commented 1 year ago

Hi Kevin,

Thank you. I am using the FASTA files now, but when I run the Singularity image I get the following warnings:

time="2022-10-21T09:18:19+01:00" level=warning msg="\"/run/user/19906\" directory set by $XDG_RUNTIME_DIR does not exist. Either create the directory or unset $XDG_RUNTIME_DIR.: stat /run/user/19906: no such file or directory: Trying to pull image in the event that it is a public image."
Warning: [blastn] Query_1 dna_contact_aas T.. : Sequence contains no data

Is this something that is hardcoded in the image? I get an empty *PRDM9_alleles.txt file.

Many Thanks, Rashesh
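The first warning above names its own remedies: create the directory, or unset `$XDG_RUNTIME_DIR`. As a minimal sketch of the second option, assuming the container is launched from a Python wrapper, the helper below builds an environment that drops the variable only when it points at a missing directory. The function name is hypothetical, and the launch command is left as a placeholder because the thread does not show it. This addresses only the runtime-directory warning; the thread does not establish whether it is related to the blastn warning or to the empty output file.

```python
import os
from typing import Dict, Mapping


def env_without_stale_runtime_dir(env: Mapping[str, str]) -> Dict[str, str]:
    """Copy env, removing XDG_RUNTIME_DIR if it names a directory that does not exist."""
    cleaned = dict(env)
    runtime_dir = cleaned.get("XDG_RUNTIME_DIR")
    if runtime_dir is not None and not os.path.isdir(runtime_dir):
        del cleaned["XDG_RUNTIME_DIR"]
    return cleaned


# Usage sketch (the command list is a placeholder, not the tool's real invocation):
#   import subprocess
#   subprocess.run(command, env=env_without_stale_runtime_dir(os.environ), check=True)
```

Only the child process sees the modified environment, so the calling shell session is left unchanged.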