Open Jokendo-collab opened 9 months ago
To use the tools in the VG toolkit, you would need either genome assemblies or variant calls for the inbred lines. Do you have that available?
I have the Illumina FASTQ files for the F0. I can use those to do variant calling. How can I use the resulting variant files to do simultaneous haplotype-specific transcript quantification?
If you generate a phased VCF file, you can use vg rna to create a diploid transcriptome and a spliced variation graph. After that, you can use vg mpmap to map RNA-seq reads to the spliced variation graph. There's an external tool called rpvg that can then estimate haplotype-resolved expression. The process is described further in this publication.
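To make the order of the steps explicit, here is a rough Python sketch of the data flow between the three tools. It does not run vg or rpvg and does not model any command-line flags. The artifact names (`phased_vcf`, `transcript_annotation`, and so on) are hypothetical labels chosen for illustration, and the transcript annotation input and the reuse of the spliced graph by rpvg are my assumptions about what the tools need, so check each tool's own help output before relying on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    tool: str
    inputs: tuple
    outputs: tuple


# Hypothetical artifact labels, not real file names or command-line flags.
WORKFLOW = (
    Step("vg rna",
         ("phased_vcf", "transcript_annotation"),
         ("spliced_graph", "diploid_transcriptome")),
    Step("vg mpmap",
         ("spliced_graph", "rnaseq_reads"),
         ("alignments",)),
    Step("rpvg",
         ("spliced_graph", "diploid_transcriptome", "alignments"),
         ("haplotype_expression",)),
)


def missing_inputs(workflow, provided):
    """Return (tool, artifact) pairs that no earlier step or user input supplies."""
    available = set(provided)
    missing = []
    for step in workflow:
        for name in step.inputs:
            if name not in available:
                missing.append((step.tool, name))
        # Outputs become available to every later step.
        available.update(step.outputs)
    return missing


if __name__ == "__main__":
    have = ["phased_vcf", "transcript_annotation", "rnaseq_reads"]
    print(missing_inputs(WORKFLOW, have))
```

The point of the check is simply that the phased VCF has to exist before anything else can run, and that rpvg consumes the outputs of both earlier steps.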
It's not completely clear to me whether you are thinking about using RNA-seq or genomic DNA to call the variants. As a heads up, using RNA-seq for this process will have lower yield. The problem is that it is difficult to identify the variants when the expression of the associated transcripts is low, which means you have reduced recall on exactly the variants you are interested in (i.e. the ones with strongly haplotype-biased expression).
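As a rough illustration of why recall drops, here is a toy binomial model in Python. It assumes a variant is "detected" when at least `min_alt_reads` of the reads covering the site carry the alternate allele. That threshold, the function name, and the example depths are all assumptions made for illustration, and real variant callers use more elaborate models.

```python
from math import comb


def detection_probability(depth, alt_fraction, min_alt_reads=2):
    """P(at least min_alt_reads of depth reads carry the alt allele), binomial model."""
    p_fewer = sum(
        comb(depth, i) * alt_fraction**i * (1 - alt_fraction) ** (depth - i)
        for i in range(min_alt_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # Compare a well-covered and a poorly covered site, with balanced (0.5)
    # and strongly haplotype-biased (0.1) expression of the alt allele.
    for depth in (5, 30):
        for alt_fraction in (0.5, 0.1):
            p = detection_probability(depth, alt_fraction)
            print(f"depth={depth} alt_fraction={alt_fraction} p_detect={p:.3f}")
```

With genomic DNA, depth is roughly uniform and the alternate allele fraction at a heterozygous site is close to 0.5. With RNA-seq, both depend on expression, so low-expressed and haplotype-biased transcripts are hit on both counts.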
I want to use RNA-seq data from the F2 [ABCD] generation. From my reading, vg rna needs transcriptomic and not genomic data. I will give your suggestion a try and see how it goes.
We have two sets, [A x B] and [C x D], of F0 animals which were short-read sequenced. The F0 ([A x B] and [C x D]) were repeatedly mated to give F1 [AB] and [CD]. The F1 were then mated to give F2 [ABCD]. We now want to identify haplotype-specific transcripts for A, B, C, and D in F2 using F2 RNA-seq data. How can I go about this? Any suggestions?
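To spell out the cross, here is a small Python sketch that enumerates which founder haplotypes an F2 animal can carry at a single locus. It assumes the four founders are fully inbred (homozygous), that segregation is plain Mendelian, and it looks at one locus only, so recombination between loci is ignored. The function name is hypothetical.

```python
from itertools import product


def offspring_genotypes(parent1, parent2):
    """All haplotype pairs formed by taking one haplotype from each parent."""
    return sorted({tuple(sorted(pair)) for pair in product(parent1, parent2)})


# F1 animals, assuming inbred founders: each F1 carries one haplotype per founder.
f1_ab = ("A", "B")
f1_cd = ("C", "D")

if __name__ == "__main__":
    for genotype in offspring_genotypes(f1_ab, f1_cd):
        print("/".join(genotype))
```

Under these assumptions, any single F2 animal carries only two of the four founder haplotypes at a given locus, one from {A, B} and one from {C, D}, so seeing expression from all four requires looking across several F2 animals.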