Resolve duplicate base assignment of TEs.

sjteresi / TE_Density

Python script calculating transposable element density for all genes in a genome. Publication: https://mobilednajournal.biomedcentral.com/articles/10.1186/s13100-022-00264-4

GNU General Public License v3.0

Resolve duplicate base assignment of TEs. #31

Closed sjteresi closed 3 years ago

sjteresi commented 4 years ago

Sara Anderson in her maize methylome paper modified a TE annotation using RTrackLayer in R so that each base of the genome could only be assigned to a single TE.

Investigate RTrackLayer and have a discussion with Pat. This may aid in our interpretation steps, so that we are not double counting or needing to do any silly math.
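To make the "one base, one TE" idea concrete, here is a rough Python sketch. It is not the workflow from that paper; the `TE` tuple, the `resolve_overlaps` name, and especially the tie-break rule (the shortest overlapping TE wins the base, so nested insertions take priority) are assumptions made only for illustration. Coordinates are assumed to be 0-based, half-open, on a single chromosome.

```python
from typing import List, NamedTuple, Tuple


class TE(NamedTuple):
    """Hypothetical minimal TE record (one chromosome only)."""
    name: str
    start: int  # 0-based, inclusive
    stop: int   # 0-based, exclusive


def resolve_overlaps(tes: List[TE]) -> List[Tuple[int, int, str]]:
    """Split overlapping TEs into disjoint (start, stop, name) segments.

    Assumed policy: where several TEs cover the same bases, the
    shortest TE owns them. Swap the `key` below for another rule.
    """
    # Between two consecutive breakpoints the set of covering TEs is constant.
    breakpoints = sorted({pos for te in tes for pos in (te.start, te.stop)})
    segments: List[Tuple[int, int, str]] = []
    for left, right in zip(breakpoints, breakpoints[1:]):
        covering = [te for te in tes if te.start <= left and te.stop >= right]
        if not covering:
            continue  # gap with no TE at all
        owner = min(covering, key=lambda te: (te.stop - te.start, te.start, te.name))
        if segments and segments[-1][2] == owner.name and segments[-1][1] == left:
            # Same owner continues: extend the previous segment.
            segments[-1] = (segments[-1][0], right, owner.name)
        else:
            segments.append((left, right, owner.name))
    return segments


nested = [TE("A", 0, 100), TE("B", 40, 60)]
print(resolve_overlaps(nested))
```

With this policy the outer element is split around the nested one, and the segment lengths sum to the number of covered bases, so nothing is double counted. The scan over all TEs per segment is quadratic; a real implementation would want a sweep line or an interval library.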

sjteresi commented 3 years ago

Shujun has a system for resolving duplicate base pairs within EDTA; it is just a Perl script. It requires the TEs in BED format. After looking at the script, it is unclear how the annotation naming scheme is preserved.
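One possible way to carry our naming through a BED file is sketched below. This is an assumption about what we could do on our side, not a description of what the EDTA script expects or emits. The `to_bed_line` helper is hypothetical, input coordinates are assumed to be 1-based inclusive (GFF-style), and the TE label is simply placed in the standard BED name column (column 4).

```python
def to_bed_line(chrom: str, start_1based: int, stop_1based: int,
                te_name: str, strand: str = ".") -> str:
    """Format one TE as a 6-column BED line.

    BED is 0-based and half-open, so only the start shifts by one
    when converting from 1-based inclusive coordinates.
    """
    bed_start = start_1based - 1
    # Column 5 is the BED score; 0 is used here as a placeholder.
    fields = [chrom, str(bed_start), str(stop_1based), te_name, "0", strand]
    return "\t".join(fields)


print(to_bed_line("Chr1", 101, 200, "LTR/Gypsy", "+"))
```

Whether the name column survives the overlap resolution unchanged would still need to be checked against the script's actual output.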

sjteresi commented 3 years ago

Spoke with Adrian and Pat: I gave them a short presentation on the problem and potential fixes.

@teresi please take a look at this powerpoint just to see the current state of affairs and re-familiarize yourself with the problem.

I am getting suggestions from Pat and Adrian, will keep you updated.

sjteresi commented 3 years ago

@teresi I am ready to merge with master. However, given the ongoing changes in the genedata_cache branch, I would like to integrate that one first (possibly into this branch?). I need the new paths for the verify commands from genedata_cache, and I will need to update the True/False usage of a command-line argument in this branch (I will borrow some code from genedata_cache for that).

So, given that genedata_cache contains some newer features that I would like to incorporate into this branch (and I will probably have to add one more small commit on top of that), how should I best handle adding this body of code to master?

sjteresi commented 3 years ago

Reminder that the ulimit needs to be set. Possibly discuss with Michael how to have the user set that themselves; currently I have to set it in every terminal session.
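One option to discuss: raise the limit from inside Python at startup so nobody has to remember the shell command. The sketch below assumes the limit in question is the open-file-descriptor limit (`ulimit -n`), which this thread does not actually specify. The `raise_open_file_limit` name and the 4096 default are hypothetical. It uses the standard library `resource` module, so it is Unix-only.

```python
import resource


def raise_open_file_limit(target: int = 4096) -> int:
    """Raise the soft open-file limit toward `target`; return the soft limit in effect.

    Assumption: the ulimit that matters is RLIMIT_NOFILE (`ulimit -n`).
    An unprivileged process may not exceed the hard limit, so the
    request is capped there.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough; leave it alone
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


print(raise_open_file_limit())
```

This only affects the current process and its children, so it would have to be called early in the entry point. If the needed value exceeds the hard limit, the user still has to change it at the system level.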