GATB / DiscoSnp

DiscoSnp is designed for discovering all kinds of SNPs (not only isolated ones), as well as insertions and deletions, from raw set(s) of reads.
https://gatb.inria.fr/software/discosnp/
GNU Affero General Public License v3.0

Using Disco SNP to find sex markers; calling on a cohort with males and females #22

Closed robertwhbaldwin closed 3 years ago

robertwhbaldwin commented 3 years ago

Hi, I was wondering if anyone has used this tool to genotype multiple individuals. I'm trying to use this tool to find a sex marker in a species with no reference genome. It runs fine on a single sample, but how do you do joint genotyping? Do you merge input fasta files together? Do you have to cluster unitigs together after genotyping each sample individually? Thanks - Robert

Update: I just realized that the VCF file contains two samples (G1 and G2) because I put the R1 and R2 reads for a single sample on separate lines of the input read.txt file. I think this answers my own question. There should be a way of writing the read.txt input file so that R1 and R2 for multiple samples end up in the same VCF, with genotypes called per sample.

pierrepeterlongo commented 3 years ago

Hello Robert.

I think you indeed have the answer to your question :) I encourage you to have a look at the documentation (https://github.com/GATB/DiscoSnp/blob/master/doc/discoSnp_user_guide.pdf), section "Input file of file format". It explains several classic use cases (paired reads, several samples, ...).

Best, Pierre
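
For readers landing on this thread, here is a minimal sketch of how the input files could be generated for a cohort. It rests on two assumptions that should be checked against the user guide section mentioned above: that each line of the top-level input file is treated as one read set (one genotype column in the VCF, as Robert observed with G1 and G2), and that a line may itself point to a nested file listing several read files, so that R1 and R2 of one individual are grouped into a single read set. The function name `write_read_sets`, the `fof_<sample>.txt` naming, and the sample and FASTQ names in the usage note are all hypothetical.

```python
from pathlib import Path


def write_read_sets(samples, out_dir):
    """Write one nested list file per sample plus a top-level list file.

    samples: dict mapping a sample name to the list of its read files
             (for example the R1 and R2 files of one individual).
    Assumption (not confirmed in this thread): DiscoSnp treats each line
    of the top-level file as one read set, and accepts a nested list file
    as a line. Check the user guide before relying on this layout.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    top_lines = []
    for name, read_files in samples.items():
        # One nested file per individual, listing all of its read files.
        sample_fof = out_dir / f"fof_{name}.txt"
        sample_fof.write_text("".join(f"{path}\n" for path in read_files))
        top_lines.append(f"{sample_fof}\n")

    # Top-level file: one line per individual, in the order given.
    top = out_dir / "fof.txt"
    top.write_text("".join(top_lines))
    return top
```

Calling `write_read_sets({"male_1": ["m1_R1.fastq.gz", "m1_R2.fastq.gz"], "female_1": ["f1_R1.fastq.gz", "f1_R2.fastq.gz"]}, "discosnp_input")` would produce a two-line `fof.txt`, which would then be given to DiscoSnp in place of the read.txt file from the original question. If the assumptions hold, the order of the lines determines which genotype column (G1, G2, ...) belongs to which individual, so it is worth keeping a record of that order when comparing males and females.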