shilpagarg / DipAsm

DipAsm pipeline #30

Open sumoii opened 1 year ago

sumoii commented 1 year ago

Dear Shilpa Garg,

The paper "Chromosome-scale, haplotype-resolved assembly of human genomes" describes a step that groups and orders contigs into scaffolds with Hi-C data using 3D-DNA:

“We mapped Hi-C reads to contigs with BWA-MEM v.0.7.17 and scaffolded the Peregrine contigs with juicer v.1.5 and 3D-DNA v.180922. We preprocessed data with ‘juicer.sh -d juicer -p chrom.sizes -y cut-sites.txt -z contigs.fa -D’, where file ‘cut-sites.txt’ was generated using the generate_site_positions_Arima.py script, which outputs merged_nodups.txt. The scaffolds were produced with ‘run-asm-pipeline.sh -m haploid contigs.fa merged_nodups.txt’. We then called small variants using DeepVariant v.0.8.0 with the pretrained PacBio model.”

However, I noticed that pipeline.sh (https://github.com/shilpagarg/DipAsm/pipeline.sh) skips this step. Could you tell me whether something is missing from the script, or whether I have misunderstood the workflow? Looking forward to your reply.
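
For reference, the two commands from the quoted passage can be laid out as a small dry-run script. This is only a sketch of the step as the paper words it, not something taken from pipeline.sh: the `run` helper and the `DRY_RUN` variable are hypothetical names introduced here, the file names are the ones in the quote, and the arguments are copied verbatim (including the trailing `-D`, which may be incomplete as quoted). By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the Hi-C scaffolding step as quoted from the paper.
# Assumptions: file names and flags are copied verbatim from the quote;
# `run` and DRY_RUN are hypothetical helpers, not part of DipAsm.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"

run() {
  # Print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# cut-sites.txt is assumed to exist already. The quote says it comes from
# generate_site_positions_Arima.py, but does not give that script's arguments.

# Step 1: juicer preprocessing, which the quote says yields merged_nodups.txt.
run juicer.sh -d juicer -p chrom.sizes -y cut-sites.txt -z contigs.fa -D

# Step 2: 3D-DNA scaffolding of the contigs in haploid mode.
run run-asm-pipeline.sh -m haploid contigs.fa merged_nodups.txt
```

Setting `DRY_RUN=0` would actually execute the two commands, which requires juicer and 3D-DNA to be installed and on `PATH`.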