If I look at the individual sample cov.stats files, as below, is it correct that the last column represents the mean coverage values? Is the number of mapped reads also reported somewhere in the output? Many thanks in advance @jpuritz @pdimens
head Sample1.cov.stats
comp5 0 439 2
comp6 0 67 0
comp10 0 505 2
comp12 3 388 5
comp14 18 314 21
comp22 0 298 13
comp24 0 355 4
comp27 0 408 0
comp29 0 365 0
comp31 0 392 6
This is the output of bedtools coverage -b Sample1-RG.bam -a mapped.bed -counts -sorted -g genome.file > Sample1.cov.stats
It's BED format:
contig
start coordinate
end coordinate
count of reads that overlap with the interval
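As a rough illustration of reading this four-column layout, the sketch below sums the fourth column of a per-sample file and reports the mean count per interval. The helper name summarize_cov is hypothetical and not part of dDocent; the only assumption is that the file has the whitespace-separated columns listed above.

```shell
# Summarize a four-column `bedtools coverage -counts` file:
# contig, start, end, count of reads overlapping the interval.
# summarize_cov is a hypothetical helper name, not part of dDocent.
summarize_cov() {
  awk '{ n++; total += $4 }
       END {
         if (n > 0)
           printf "intervals=%d total_count=%d mean_count=%.2f\n", n, total, total / n
       }' "$1"
}

# Run only if the example file from this thread is present.
if [ -f Sample1.cov.stats ]; then
  summarize_cov Sample1.cov.stats
fi
```

Note that the fourth column is a per-interval read count, not a mean depth, and a read that overlaps more than one interval is counted once per interval, so the total is only an approximation of the number of mapped reads. If samtools is installed, `samtools view -c -F 4 Sample1-RG.bam` counts the alignment records in the BAM that are not flagged as unmapped, which may be closer to what is being asked for.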
Hopefully, Sample1 is just an example name, but you should be following the dDocent naming convention PopulationIdentifier_SampleIdentifier. Also, dDocent is not designed for transcriptomics.
@jpuritz Many thanks for your prompt response, much appreciated! Yes, Sample1 is just an example name and I'm following the dDocent naming convention. My input data is from ezRADseq. I notice that there are also cov.stats and cov.split.stats files in my results directory. Do they represent just the mean number of read counts across all the samples that were analyzed?
No. cov.stats is the sum across all samples, not the mean. cov.split.stats is used for creating SNP calling intervals and can be ignored.
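To make the "sum" concrete, here is a minimal sketch that adds up the fourth column per interval across several per-sample files. It is an illustration of an interval-wise sum under the assumption that every per-sample file uses the same four-column layout; it is not dDocent's own code, and the helper name sum_cov is hypothetical.

```shell
# Sum the read counts (column 4) per interval across several
# per-sample cov.stats files, keeping first-seen interval order.
# sum_cov is a hypothetical helper name, not part of dDocent.
sum_cov() {
  awk 'BEGIN { OFS = "\t" }
       {
         key = $1 OFS $2 OFS $3           # contig, start, end
         if (!(key in sum)) order[++n] = key
         sum[key] += $4
       }
       END {
         for (i = 1; i <= n; i++) print order[i], sum[order[i]]
       }' "$@"
}

# Example usage (assumes per-sample files named like Pop_Sample.cov.stats):
#   sum_cov *.cov.stats > summed.cov.stats
```

Dividing each summed count by the number of samples would give the per-interval mean that the question asked about.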
Jon Puritz, PhD (he/him)
Thank you very much again
Hi @jpuritz @pdimens
Great pipeline, and thank you for bringing this out to the community! I would appreciate it very much if you could let me know where in the dDocent output I can find information such as the number of reads mapped to the reference transcriptome for each individual, mean coverage, etc.