sjteresi / TE_Density

Python script calculating transposable element density for all genes in a genome. Publication: https://mobilednajournal.biomedcentral.com/articles/10.1186/s13100-022-00264-4
GNU General Public License v3.0
30 stars, 4 forks

Refactor revision code to NOT use pandas to_hdf #127

Closed (sjteresi closed 1 year ago)

sjteresi commented 1 year ago

As requested via text messages and our call on 2/8/2023, I removed the "to_hdf" usage in the Revision code and wrote a general replacement for it.

All of the tests pass, and I ran a system test with the Arabidopsis genome.

I also added the tsv files for the system test in a commit. They are ~1.5 MB and we won't change them, so I didn't think git lfs was needed.

teresi commented 1 year ago

the tsv files are text so we don't need LFS for that

teresi commented 1 year ago

a) recommend changing class Revise_Anno to class ReviseAnno
b) line 81 of revise_annotation.py has a #TODO edit/check
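For (a), a minimal sketch of one way to do the rename without breaking existing imports, assuming other modules still reference the old name. The constructor argument `name` and the class bodies here are hypothetical placeholders, not the actual code in revise_annotation.py:

```python
import warnings


class ReviseAnno:
    """CapWords (PEP 8) name for the class.

    The body is a hypothetical placeholder; the real class and its
    constructor arguments live in revise_annotation.py.
    """

    def __init__(self, name):
        self.name = name


class Revise_Anno(ReviseAnno):
    """Deprecated alias so old call sites keep working for a while."""

    def __init__(self, *args, **kwargs):
        # Warn the caller, then defer entirely to the renamed class.
        warnings.warn(
            "Revise_Anno is deprecated; use ReviseAnno",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)
```

If nothing outside the repo imports the old name, a plain find-and-replace is simpler and the alias can be skipped.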

teresi commented 1 year ago

the tests passed for me and the system test (arabidopsis) ran in about 8 minutes w/o errors

looks good

the tsv files for the system test are also useful

thank you