bcgsc / straglr

Tandem repeat expansion detection or genotyping from long-read alignments

refactor: use dataclass for structured tsv reads #36

Open lavafroth opened 3 months ago

lavafroth commented 3 months ago

Changes

readmanchiu commented 3 months ago

I've been leery of the amount of memory used to create an object for every support read instead of a simple tuple.

lavafroth commented 3 months ago

Once the code is compiled to bytecode, the class (struct) has only three fields, so memory consumption should very likely go down. Also, the slice objects consume the same memory as a tuple, so the changed code improves readability without incurring a performance penalty.
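The memory question can be measured instead of estimated. The sketch below uses `tracemalloc` to compare the average cost per record for a plain tuple, a regular dataclass, and a dataclass that declares `__slots__`. The class names and the three fields (`read_name`, `start`, `size`) are hypothetical stand-ins, not the fields used in this PR, and the numbers depend on the Python version.

```python
import tracemalloc
from dataclasses import dataclass


@dataclass
class SupportRead:
    # Hypothetical fields; the PR's actual dataclass may differ.
    read_name: str
    start: int
    size: int


@dataclass
class SlottedSupportRead:
    # __slots__ removes the per-instance __dict__. Declaring it by hand
    # works here because none of the fields has a default value.
    __slots__ = ("read_name", "start", "size")
    read_name: str
    start: int
    size: int


def make_tuple(read_name, start, size):
    return (read_name, start, size)


def bytes_per_record(factory, n=50_000):
    """Average traced bytes per record when building a list of n records.

    The figure includes the list slot and the integer objects, which are
    the same for every factory, so only the differences are meaningful.
    """
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    records = [factory("read", i, i + 1) for i in range(n)]
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return (after - before) / len(records)


results = {
    "tuple": bytes_per_record(make_tuple),
    "dataclass": bytes_per_record(SupportRead),
    "slotted dataclass": bytes_per_record(SlottedSupportRead),
}

for label, size in results.items():
    print(f"{label:>18}: {size:.1f} bytes/record")
```

If the regular dataclass turns out to be too heavy for the number of support reads involved, the slotted variant keeps the named fields while dropping the instance dictionary.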

readmanchiu commented 2 months ago

Traceback (most recent call last):
  File "/projects/btl_scratch/rchiu/tmp/straglr/extract_repeats.py", line 7, in <module>
    from dataclasses import dataclass
  File "/home/rchiu/miniconda2/lib/python3.9/dataclasses.py", line 5, in <module>
    import inspect
  File "/home/rchiu/miniconda2/lib/python3.9/inspect.py", line 39, in <module>
    import importlib.machinery
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 786, in exec_module
  File "<frozen importlib._bootstrap_external>", line 881, in get_code
  File "<frozen importlib._bootstrap_external>", line 980, in get_data

lavafroth commented 2 months ago

You might have conflicting modules, for example a file in your project whose name matches a standard-library module. Dataclasses were introduced in Python 3.7.
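One way to test the shadowing hypothesis is to ask the import system where each module would be loaded from and compare that with the interpreter's standard-library directory. This is a minimal sketch, not a diagnosis of the traceback above: the helper name `shadowed_stdlib_modules` is made up for illustration, and it only handles top-level module names.

```python
import os
import sys
import sysconfig
from importlib.machinery import PathFinder


def shadowed_stdlib_modules(names, search_path=None):
    """Map each name to its file if it resolves outside the stdlib directory.

    PathFinder searches the given path entries directly and ignores
    sys.modules, so the result reflects what a fresh import would pick up.
    """
    stdlib_dir = os.path.realpath(sysconfig.get_paths()["stdlib"])
    path = sys.path if search_path is None else search_path
    shadowed = {}
    for name in names:
        spec = PathFinder.find_spec(name, path)
        if spec is None or spec.origin is None:
            # Not found on the path (for example built-in or frozen modules).
            continue
        origin = os.path.realpath(spec.origin)
        if not origin.startswith(stdlib_dir + os.sep):
            shadowed[name] = origin
    return shadowed


# Check the modules that appear in the traceback's import chain.
for name, origin in shadowed_stdlib_modules(["dataclasses", "inspect", "importlib"]).items():
    print(f"{name} resolves to {origin}, outside the standard library")
```

Running this from the same directory and with the same interpreter as the failing script matters, because the script's directory is placed first on `sys.path`.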