lh3 / psmc

Implementation of the Pairwise Sequentially Markovian Coalescent (PSMC) model
Other
152 stars 60 forks source link

gen_raw_mask.pl not working correctly #38

Open NicMAlexandre opened 3 years ago

NicMAlexandre commented 3 years ago

The faidx for my largest chromosome in the original assembly is as follows:

Chr1 196345723 6 60 61

The faidx for rawMask_35.fa and mask_35_50.fa show the following:

Chr1 5609878 6 60 61

I'm not sure why the size of my chromosome is so drastically reduced and I ave confirmed each previous step finished without error.

The following are my commands:

psmc/utils/./splitfa Sept22Assembly.fasta 35 | split -l 20000000 cat xaa xab xac > xxaa bwa index -a bwtsw genome.fasta bwa aln -R 1000000 -O 3 -E 3 genome.fasta xxaa > xxaa.sai bwa samse genome.fasta xxaa.sai xxaa > aln-se.sam samtools sort -n aln-se.sam -o aln-se_sorted.sam gzip aln-se_sorted.sam gzip -dc aln-se_sorted.sam.gz | perl seqbility-20091110/gen_raw_mask.pl > rawMask_35.fa seqbility-20091110/./gen_mask -l 35 -r 0.5 rawMask_35.fa > mask_35_50.fa

NicMAlexandre commented 3 years ago

We also tried the following with the same result:

psmc/utils/./splitfa genome.fasta 35 | split -l 20000000

mappability_kmers.list contains the following

xaa xab xac

for x in $(cat mappability_kmers.list); do bwa aln -R 1000000 -O 3 -E 3 -t 20 genome.fasta $x > ${x}.sai; done for x in $(cat mappability_kmers.list); do bwa samse genome.fasta ${x}.sai ${x} | samtools sort -o ${x}.bam; done samtools merge x.bam xaa.bam xab.bam xac.bam samtools sort -n x.bam -o x_sorted.bam samtools view -h x_sorted.bam | seqbility-20091110/gen_raw_mask.pl > raw_mask_35.fa seqbility-20091110/gen_mask -l 35 -r 0.5 raw_mask_35.fa > mask_35.fa

head of mask_35.fa

Chr1 5609912 15 60 61 Chr2 4270068 5703441 60 61 Chr3 3265589 10044692 60 61 Chr4 2067153 13364723 60 61 Chr5 1273941 15466344 60 61

eroycroft commented 11 months ago

Hi Nicolas - I know this is an old post, but wondering if you happened to work out the cause or fix for this issue? We're currently having the same problem. Cheers, Emily

NicMAlexandre commented 11 months ago

Hi Emily,

I was not able to sort out the issue. But I plan to use the software again soon and will hopefully find a solution.

On Mon, Oct 23, 2023 at 9:35 PM Emily Roycroft @.***> wrote:

Hi Nicolas - I know this is an old post, but wondering if you happened to work out the cause or fix for this issue? We're currently having the same problem. Cheers, Emily

— Reply to this email directly, view it on GitHub https://github.com/lh3/psmc/issues/38#issuecomment-1776525459, or unsubscribe https://github.com/notifications/unsubscribe-auth/AFB6337MC5CP5ATCZLAUSJLYA5AQBAVCNFSM5FMINPTKU5DIOJSWCZC7NNSXTN2JONZXKZKDN5WW2ZLOOQ5TCNZXGY2TENJUGU4Q . You are receiving this because you authored the thread.Message ID: @.***>

-- Best,

Nicolas Alexandre PhD Candidate, Integrative Biology Whiteman Lab University of California - Berkeley @. @.>