Open colindaven opened 1 year ago
Hi Colin,
Thanks for the report. Are you trying to plot from explore or search? Plot only really works for search output. Two more things:
Cheers, M
Hi Max, thanks for the quick reply.
It's definitely tidk search; here's the Nextflow code. I'm using the plant canonical telomere, which works well for most assemblies.
This is the latest release AFAIK.
tidk_ubuntu_0.2.31 search --string $params.telomere --output ${prefix} --dir . --extension tsv $fasta
tidk_ubuntu_0.2.31 plot --tsv ${prefix}_telomeric_repeat_windows.tsv
mv tidk-plot.svg ${prefix}_plot.svg
Typical screenshot of the plot
Here is example data for the first part of chr1 (in bedGraph format).
Chr01 0 10000 1411
Chr01 10000 20000 777
Chr01 20000 30000 0
Chr01 30000 40000 0
Chr01 40000 50000 0
Chr01 50000 60000 0
Chr01 60000 70000 0
Chr01 70000 80000 0
Chr01 80000 90000 0
Chr01 90000 100000 0
Chr01 100000 110000 2
Chr01 110000 120000 7
Chr01 120000 130000 3
Chr01 130000 140000 2
Chr01 140000 150000 2
Chr01 150000 160000 1
Chr01 160000 170000 0
And the full tsv, renamed as csv for github
Thanks for this, super helpful. I'll check the file you have given later but I am pretty sure this is the correct behaviour. I could implement a log y-scale if that would be helpful!
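To illustrate what a log y-scale would change, here is a rough sketch in Python using the Chr01 counts posted above. It is not how tidk plot works internally; the function names are made up for this example, and log10(count + 1) is just one reasonable choice of transform, used so that zero-count windows stay at zero.

```python
import math

# Counts for the first windows of Chr01, copied from the example data above.
counts = [1411, 777, 0, 0, 0, 0, 0, 0, 0, 0, 2, 7, 3, 2, 2, 1, 0]


def log_scale(values):
    """Return log10(v + 1) for each count; the +1 keeps empty windows at 0."""
    return [math.log10(v + 1) for v in values]


def relative_height(values):
    """Express each value as a fraction of the tallest bar (0..1)."""
    top = max(values)
    return [v / top if top else 0.0 for v in values]


linear = relative_height(counts)
logged = relative_height(log_scale(counts))

# Print the bar height each window would get on a linear vs. a log axis.
for count, lin, log in zip(counts, linear, logged):
    print(f"{count:>5}  linear={lin:.3f}  log={log:.3f}")
```

On a linear axis the window with 7 repeats gets under 1% of the height of the telomeric window, which is why it is effectively invisible; after the log transform it gets more than a quarter of the height.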
Maybe. I was also thinking about giving the chromosome ends more room on a distorted x scale, since they are what is interesting here.
Or creating a simple heatmap in Python of just the chromosome ends vs. selected "background" from the chromatin.
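A minimal sketch of the data preparation for such a heatmap, assuming the four-column bedGraph layout shown above (chromosome, start, end, count). The helper names parse_bedgraph and chromosome_end_matrix are hypothetical, and the selection of "background" windows and the actual plotting are left out.

```python
from collections import defaultdict


def parse_bedgraph(text):
    """Parse whitespace-separated 'chrom start end count' lines into tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        chrom, start, end, count = fields
        rows.append((chrom, int(start), int(end), int(count)))
    return rows


def chromosome_end_matrix(rows, n_windows):
    """For each chromosome keep the counts of the first and last n_windows
    windows, ordered by start position. Chromosomes with fewer than
    2 * n_windows windows will have overlapping halves."""
    by_chrom = defaultdict(list)
    for chrom, start, _end, count in rows:
        by_chrom[chrom].append((start, count))
    matrix = {}
    for chrom, windows in by_chrom.items():
        counts = [count for _start, count in sorted(windows)]
        matrix[chrom] = counts[:n_windows] + counts[-n_windows:]
    return matrix


# The Chr01 excerpt from above; it covers only the start of the chromosome,
# so the "last" windows here are not the real chromosome end.
example = """\
Chr01 0 10000 1411
Chr01 10000 20000 777
Chr01 20000 30000 0
Chr01 30000 40000 0
Chr01 40000 50000 0
Chr01 50000 60000 0
Chr01 60000 70000 0
Chr01 70000 80000 0
Chr01 80000 90000 0
Chr01 90000 100000 0
Chr01 100000 110000 2
Chr01 110000 120000 7
Chr01 120000 130000 3
Chr01 130000 140000 2
Chr01 140000 150000 2
Chr01 150000 160000 1
Chr01 160000 170000 0
"""

matrix = chromosome_end_matrix(parse_bedgraph(example), n_windows=3)
print(matrix)
```

Each row of the resulting dictionary (one per chromosome) could then be passed as a row to any heatmap function, with a log colour scale if the end counts still dominate.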
Hi,
I've been trying tidk search, explore and then plotting with tidk plot.
Data on the plots are barely visible. Perhaps a log scale would be more effective? I'm not sure whether my counts are just too low relative to the putative telomere counts, or whether the whole graph is scaled so that I can't see much.
Counts (example) from a recent public genome, fairly typical for my genomes: telomeres have counts of roughly 1000 copies, intrachromosomal windows 10 to 200.
This is a summary file for the top 60 lines of a file sorted by Telomere_count, then sorted by Chr and then by Start. I'm using the full tsv for the plot step of course.