Open slobentanzer opened 5 years ago
Hi Sebastian,
Sorry for the late response.
They are essentially generated using bedtools coverage, with each BAM file passed as the -b option and the BED file of regions for each fragment passed as the -a option.
In the output, V2 is the start, V3 the end, V4 the annotation, V5 the score (you can ignore this), V6 the strand, and V7 the number of reads that overlap each BED region. The output of this analysis is found here: tRNA-mapping.dir/{name_of_file}_fragment_coverage.bed.
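As a rough illustration of that column layout, here is a minimal base R sketch that reads coverage rows and gives the columns readable names. The two example rows, the column names, and the assumption that V1 holds the chromosome (cluster) name are all made up for illustration and are not taken from the pipeline.

```r
# Hypothetical example rows in the layout described above (tab-separated).
# The values, and V1 being the chromosome/cluster name, are assumptions.
example_lines <- c(
  "cluster1\t0\t20\tfragment_a\t0\t+\t15",
  "cluster1\t20\t45\tfragment_b\t0\t-\t3"
)

coverage <- read.table(
  text = example_lines,
  sep = "\t",
  header = FALSE,
  stringsAsFactors = FALSE
)

# read.table() names the columns V1, V2, ... by default.
# Rename only the first seven, so any extra trailing columns are left alone.
names(coverage)[1:7] <- c(
  "chrom", "start", "end", "annotation", "score", "strand", "count"
)

print(coverage[, c("annotation", "start", "end", "strand", "count")])
```

For a real file, the `text = example_lines` argument would be replaced by the path to the corresponding `_fragment_coverage.bed` file.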
BW, Adam
Hi Adam,
I got around to checking the sequences. I used hg38_cluster.fa to look up the sequence for each fragment in the BED file; is that correct? Like so: as.character(subseq({fasta-file}[{bed}$Chr[i]], bed$Start[i], bed$End[i]))
Is there a way to check if the sequences generated this way for each fragment are actually correct?
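One possible sanity check, sketched in base R on toy data: standard BED starts are 0-based and ends are exclusive, while R string functions are 1-based and inclusive, so the start usually needs shifting by one. Whether these particular files follow the standard BED convention is an assumption here, and the reference sequence, coordinates, and column names below are invented for the example.

```r
# Toy reference: a named character vector standing in for the FASTA contents.
reference <- c(cluster1 = "ACGTACGTTTGGCCAAGGTT")

# Toy BED-style rows; 'start' is assumed 0-based and 'end' exclusive.
bed <- data.frame(
  chrom = c("cluster1", "cluster1"),
  start = c(0, 8),
  end = c(4, 12),
  stringsAsFactors = FALSE
)

# substr() is 1-based and inclusive, so shift the start by one.
bed$sequence <- unname(substr(reference[bed$chrom], bed$start + 1, bed$end))

# Sanity check: each extracted sequence should be exactly end - start long.
stopifnot(all(nchar(bed$sequence) == bed$end - bed$start))

print(bed)
```

If the length check fails, or every sequence looks shifted by one base relative to a known fragment, the coordinate convention is the first thing to look at. With the subseq() call quoted above, the same shift would mean passing bed$Start[i] + 1 as the start.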
And: what does a negative strand mean in this context? Is it the reverse complement? (EDIT: The negative strand fragments do not always have the same count as the positives, it just happens often.)
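Whether the negative strand means "reverse complement" in this pipeline is not settled in this thread. If it does, a base R sketch for orienting minus-strand fragments could look like the following; the function name `reverse_complement` and the toy data frame are hypothetical, and only the plain bases A, C, G, T are handled.

```r
# Reverse complement for plain DNA strings (A, C, G, T in either case).
reverse_complement <- function(sequence) {
  # Complement each base, then reverse the character order per string.
  complemented <- chartr("ACGTacgt", "TGCAtgca", sequence)
  vapply(
    strsplit(complemented, ""),
    function(bases) paste(rev(bases), collapse = ""),
    character(1)
  )
}

# Toy fragments: the same forward-strand sequence on both strands.
fragments <- data.frame(
  sequence = c("AACG", "AACG"),
  strand = c("+", "-"),
  stringsAsFactors = FALSE
)

# Keep plus-strand sequences as they are; flip minus-strand ones.
fragments$oriented <- ifelse(
  fragments$strand == "-",
  reverse_complement(fragments$sequence),
  fragments$sequence
)

print(fragments)
```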
Kind regards, Sebastian
Hi Sebastian,
I will look into this at the same time as your other issue.
Thanks as always, Best wishes, Adam
hi adam,
i finally had time to look at the results, found another bug (will make separate issue), but got it to work in the end. i am in the QC_alignment.Rmd, and i was wondering what the BED file columns were (they are custom, right?).
what i am basically looking for is the easiest way to get the sequence of each fragment using the coordinates from each BED. so is chromStart = V2, chromEnd = V3? count = V7? which reference should i use, and where do i find it in the folder?
thanks!
sebastian