Open zhenyu7500 opened 5 days ago
Hi Zhenyu,
You can generate the count_table output to find all counts in all windows and confirm how many counts there are in overlapping windows.
Other possibilities include extreme GC content and bias or degenerate nucleotides at those sites, very low counts in the input sample that make enrichment difficult to estimate, and high background or broad coverage in the IP samples that makes enriched peaks difficult to separate.
Thank you for your help! I have confirmed that the reason for the missing peaks is the lack of corresponding annotations.
I now have the annotation information for the missing intervals, and I would like to ask how to generate the following files: partition.bed, partition.features.tsv, partition.nuc, and accession_type_ranking.txt.
It seems that these files were generated automatically by the software when I first ran it.
Thank you very much for your help!
zhenyu
It seems that the files I need (partition.bed, partition.features.tsv, and partition.nuc) can be obtained using the parse_gff.R script with the GTF file and accession_type_ranking.txt as inputs. Can I use the accession_type_ranking.txt file provided on your GitHub without any changes?
What is the meaning of the ranking in the accession_type_ranking.txt file? If I am most interested in lncRNA, do I need to move it to the first position?
Thanks for your help!
Best regards,
Zhenyu
Hello,
Overlapping annotations in the GFF are resolved by assigning a top feature_type to the window. The accession type ranking file is the way the top feature_type is determined. All of the feature_types in the GFF must be present in the accession type ranking file.
If you add a new feature_type to the GFF (or directly to any downstream files) then you will need to place it in the accession type ranking file.
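A quick way to catch a missing entry before rerunning is to compare the types found in the GTF against the ranking file. The following is only a rough sketch and not part of Skipper: it assumes the ranking file lists one feature type per line and that the type lives in a GTF attribute such as gene_type (the GENCODE convention). The function names and the example records are made up for illustration, so adjust the attribute name to whatever your annotation actually uses.

```python
import re


def feature_types_in_gtf(lines, attribute="gene_type"):
    """Collect the distinct values of one attribute from GTF lines.

    ASSUMPTION: the feature type is stored in the 9th column as
    `attribute "value";`. This is a guess at the layout, not Skipper's parser.
    """
    pattern = re.compile(rf'\b{re.escape(attribute)} "([^"]+)"')
    types = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = pattern.search(fields[8])
        if match:
            types.add(match.group(1))
    return types


def missing_from_ranking(gtf_types, ranking_lines):
    """Return the feature types that are absent from the ranking list.

    ASSUMPTION: the ranking file has one feature type per line.
    """
    ranked = {line.strip() for line in ranking_lines if line.strip()}
    return sorted(gtf_types - ranked)


# Hypothetical records, for illustration only.
gtf_lines = [
    "##description: example annotation",
    'chr1\tHAVANA\tgene\t100\t200\t.\t+\t.\tgene_id "G1"; gene_type "lncRNA";',
    'chr1\tHAVANA\tgene\t150\t300\t.\t+\t.\tgene_id "G2"; gene_type "protein_coding";',
    'chr1\tHAVANA\tgene\t400\t450\t.\t-\t.\tgene_id "G3"; gene_type "miRNA";',
]
ranking_lines = ["miRNA\n", "protein_coding\n"]

missing = missing_from_ranking(feature_types_in_gtf(gtf_lines), ranking_lines)
print(missing)
```

With real files you would pass open file handles instead of the lists; any type that is printed would need to be added to the ranking file.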
You can reorder the rankings in any way you like. Small RNAs are almost universally more abundant than mRNAs, which are almost universally more abundant than lncRNAs in eCLIP libraries, but of course there are always exceptional loci and RNA-binding proteins with very particular types of signal.
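To make the effect of reordering concrete, here is a minimal sketch of how a ranking could pick the top feature_type for a window that overlaps several annotations. This is an illustration under stated assumptions, not Skipper's actual code: it assumes that types listed earlier in the ranking win, and the function name and the example type labels are hypothetical.

```python
def top_feature_type(overlapping_types, ranking):
    """Pick the highest-ranked feature type among those overlapping a window.

    ASSUMPTION: earlier entries in `ranking` take priority. Every overlapping
    type must appear in the ranking, otherwise a KeyError is raised.
    """
    rank = {ftype: position for position, ftype in enumerate(ranking)}
    unknown = [ftype for ftype in overlapping_types if ftype not in rank]
    if unknown:
        raise KeyError(f"feature types missing from ranking: {unknown}")
    return min(overlapping_types, key=rank.__getitem__)


# A window overlapping both an mRNA and a lncRNA annotation (made-up example).
window_types = ["protein_coding", "lncRNA"]

# Ranking ordered by typical abundance: small RNA, then mRNA, then lncRNA.
abundance_ranking = ["miRNA", "protein_coding", "lncRNA"]
top_by_abundance = top_feature_type(window_types, abundance_ranking)

# The same window after moving lncRNA to the first position.
lncrna_first_ranking = ["lncRNA", "miRNA", "protein_coding"]
top_lncrna_first = top_feature_type(window_types, lncrna_first_ranking)

print(top_by_abundance, top_lncrna_first)
```

Under these assumptions, moving lncRNA to the top only changes the label for windows where a lncRNA overlaps another annotation; windows covered by a single annotation keep their type either way.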
Hi, skipper developer,
Thank you for developing skipper!
I have plotted two figures showing the peaks in the region of interest and the read depth of the CLIP data. In the area highlighted in Figure 1, I believe there are additional detectable peaks. Furthermore, the absence of peaks across a large region in Figure 2 has left me quite perplexed.
Thanks for your kind help!
Best regards,
zhenyu