sjteresi / TE_Density

Python script calculating transposable element density for all genes in a genome. Publication: https://mobilednajournal.biomedcentral.com/articles/10.1186/s13100-022-00264-4
GNU General Public License v3.0

overlap to/from disk #9

Closed teresi closed 4 years ago

teresi commented 4 years ago
sjteresi commented 4 years ago

Instead of Overlap.py::OverlapData did you mean OverlapDataSink? Because the cache and RAM usage statements give me the impression that you want me to refactor the Sink section.

Can you explain how the OverlapDataSink and OverlapData sections mesh together if you do indeed want to/from disk methods inside OverlapData?

It also appears that the datatype is already float32 because of the statement dtype = np.float32 on line 97, and that is then enforced on line 131. Or am I missing something?

teresi commented 4 years ago

> Instead of Overlap.py::OverlapData did you mean OverlapDataSink? Because the cache and RAM usage statements give me the impression that you want me to refactor the Sink section.

OverlapData uses _OverlapDataSink to do that. No, I don't want you to refactor the sink section, although it is missing some features right now. I can advise you later on changes to that class.

teresi commented 4 years ago

> Can you explain how the OverlapDataSink and OverlapData sections mesh together if you do indeed want to/from disk methods inside OverlapData?

OverlapData contains an _OverlapDataSink. I am still considering the implementation. For now, I'd like to limit _OverlapDataSink to just the writing portion. The to/from disk calls will probably end up in OverlapData. I don't intend _OverlapDataSink to be part of the interface; it would only be used inside OverlapData.
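To make the relationship concrete, here is a minimal sketch of that composition. It is an illustration, not the code in the repository: the method names `to_disk`, `from_disk`, and `write`, the helper `round_trip`, and the use of JSON as the on-disk format are all assumptions made so the example stays self-contained.

```python
import json
import os
import tempfile


class _OverlapDataSink:
    """Hypothetical write-only helper; the leading underscore marks it private."""

    def __init__(self, path):
        self._path = path

    def write(self, values):
        # Single responsibility: persist the values to disk.
        # JSON is a stand-in here; the real on-disk format may differ.
        with open(self._path, "w") as handle:
            json.dump(values, handle)


class OverlapData:
    """Hypothetical public container that owns the sink."""

    def __init__(self, values):
        self.values = list(values)

    def to_disk(self, path):
        # Callers never touch the sink directly; it is created and used here.
        _OverlapDataSink(path).write(self.values)

    @classmethod
    def from_disk(cls, path):
        # Reading lives on the public class, so the sink stays write-only.
        with open(path) as handle:
            return cls(json.load(handle))


def round_trip(values):
    """Write values to a temporary file and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "overlap.json")
        OverlapData(values).to_disk(path)
        return OverlapData.from_disk(path).values
```

With this layout, user code only ever calls `OverlapData.to_disk` and `OverlapData.from_disk`, so the sink can change later without affecting the interface.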

teresi commented 4 years ago

> It also appears that the datatype is already float32 because of the statement dtype = np.float32 on line 97, and that is then enforced on line 131. Or am I missing something?

The thing you are missing is that line 97 used to be uint32, and it isn't "enforced" on line 131, it is simply used. I made it a class variable rather than hard-coding it into the method, that's all. The takeaway, though, is that when I inspected the sum, it was getting close to the limit of uint32, so I changed it.
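A short sketch of both points, the class-level dtype and the uint32 limit, might look like the following. The class `OverlapSums` and the names `UINT32_MAX`, `FLOAT32_MAX`, `wrapped`, and `lost_increment` are hypothetical and exist only for illustration; the NumPy calls themselves are standard.

```python
import numpy as np


class OverlapSums:
    """Hypothetical accumulator, not the project's actual class."""

    # Class variable: changing the dtype is a one-line edit.
    dtype = np.float32

    def __init__(self, n_genes):
        # The dtype is simply used here; nothing validates or enforces it.
        self.sums = np.zeros(n_genes, dtype=self.dtype)


UINT32_MAX = int(np.iinfo(np.uint32).max)      # 4294967295
FLOAT32_MAX = float(np.finfo(np.float32).max)  # roughly 3.4e38

# Array arithmetic on uint32 wraps around silently past the limit.
wrapped = np.array([UINT32_MAX], dtype=np.uint32) + np.uint32(1)

# float32 has far more range, but above 2**24 it cannot represent
# every integer, so a small increment can be rounded away.
big = np.float32(2 ** 24)
lost_increment = bool(big + np.float32(1) == big)
```

The wraparound is why a sum approaching the uint32 limit is worth acting on: the result would be wrong without any error being raised. The last two lines show the cost of the switch, which is reduced integer precision for large values; whether that matters for these sums is not discussed in the thread.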