sjteresi / TE_Density

Python script calculating transposable element density for all genes in a genome. Publication: https://mobilednajournal.biomedcentral.com/articles/10.1186/s13100-022-00264-4
GNU General Public License v3.0

operands could not be broadcast together with shapes (2,) (11891,) #124

Closed edinatale closed 2 months ago

edinatale commented 1 year ago

Hi,

after pre-processing my input data and running process_genome.py, I get this error:

process     : 100%|████████████████████████████| 29/29 [00:21<00:00,  2.23s/it]
2022-11-08 09:28:06 erica-tes _ProgressBars[298379] ERROR failed to process gene 'path/env/out/filtered_input_data/input_h5_cache/sp_chr_01_GeneData.h5':  operands could not be broadcast together with shapes (2,) (11891,)
process     : 100%|████████████████████████████| 29/29 [00:21<00:00,  1.35it/s]
genes       :  60%|████████████▌        | 10297/17227 [00:21<00:14, 479.30it/s]
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "path/env/TE_Density/transposon/overlap_manager.py", line 98, in _process_overlap_job
    result = _calculate_overlap_job(job)
  File "path/env/TE_Density/transposon/overlap_manager.py", line 76, in _calculate_overlap_job
    file = overlap.calculate(
  File "path/env/TE_Density/transposon/overlap.py", line 454, in calculate
    sink.intra[out_slice] = Overlap.intra(gene_datum, transposons)
  File "path/env/TE_Density/transposon/overlap.py", line 83, in intra
    lower_bound = np.minimum(g_stop, transposons.stops)
  File "path/env/lib/python3.8/site-packages/pandas/core/generic.py", line 2101, in __array_ufunc__
    return arraylike.array_ufunc(self, ufunc, method, *inputs, **kwargs)
  File "path/env/lib/python3.8/site-packages/pandas/core/arraylike.py", line 397, in array_ufunc
    result = getattr(ufunc, method)(*inputs, **kwargs)
ValueError: operands could not be broadcast together with shapes (2,) (7820,) 

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/data/TEs/Repeatmasker2010/TEdensity_tool/env/TE_Density/transposon/overlap_manager.py", line 101, in _process_overlap_job
    os.path.remove(result.overlap_file)
AttributeError: module 'posixpath' has no attribute 'remove'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "path/env/TE_Density/process_genome.py", line 298, in <module>
    overlap_results = overlap_mgr.calculate_overlap()
  File "path/env/TE_Density/transposon/overlap_manager.py", line 280, in calculate_overlap
    pool.map(_process_overlap_job, todo)  # blocks execution
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
AttributeError: module 'posixpath' has no attribute 'remove'

Any suggestion about what's going on?

Thanks!
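
A note on the second exception in the traceback above: the Python standard library's os.path module has no remove function; files are deleted with os.remove (or pathlib.Path.unlink). The cleanup call at line 101 of overlap_manager.py therefore raises an AttributeError, and that is what propagates to the top level, while the broadcast ValueError is the underlying failure. Below is a minimal sketch of cleaning up after a failed job. The helpers process_job and run_with_cleanup are hypothetical stand-ins, not TE_Density code.

```python
import os
import tempfile


def process_job(path: str) -> None:
    """Hypothetical stand-in for a job that fails midway through its work."""
    raise ValueError("simulated failure while processing " + path)


def run_with_cleanup(path: str) -> str:
    """Run the job; on failure, delete the partial output and report the error."""
    try:
        process_job(path)
        return "ok"
    except ValueError as err:
        # os.remove deletes a file; os.path only manipulates path strings.
        if os.path.exists(path):
            os.remove(path)
        return f"failed: {err}"


# Create an empty temporary file to play the role of a partial output file.
fd, partial_output = tempfile.mkstemp(suffix=".h5")
os.close(fd)
status = run_with_cleanup(partial_output)
print(status)
```

With the cleanup call corrected, the original error stays visible instead of being masked by an unrelated AttributeError.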

sjteresi commented 1 year ago

Hello,

Can you please share the following?

sjteresi commented 1 year ago

Hello,

Any update on this?

janina-rinke commented 3 months ago

Hi! I encountered the exact same error. Here is my error:

process     : 100%|██████████████████████████| 150/150 [00:03<00:00, 45.32it/s]
genes       :  83%|█████████████████▍   | 9177/11076 [00:03<00:00, 2772.29it/s]
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/j/j_rink02/software/TE_Density/transposon/overlap_manager.py", line 98, in _process_overlap_job
    result = _calculate_overlap_job(job)
  File "/home/j/j_rink02/software/TE_Density/transposon/overlap_manager.py", line 76, in _calculate_overlap_job
    file = overlap.calculate(
  File "/home/j/j_rink02/software/TE_Density/transposon/overlap.py", line 454, in calculate
    sink.intra[out_slice] = Overlap.intra(gene_datum, transposons)
  File "/home/j/j_rink02/software/TE_Density/transposon/overlap.py", line 83, in intra
    lower_bound = np.minimum(g_stop, transposons.stops)
  File "/home/j/j_rink02/.local/lib/python3.10/site-packages/pandas/core/generic.py", line 2113, in __array_ufunc__
    return arraylike.array_ufunc(self, ufunc, method, *inputs, **kwargs)
  File "/home/j/j_rink02/.local/lib/python3.10/site-packages/pandas/core/arraylike.py", line 402, in array_ufunc
    result = getattr(ufunc, method)(*inputs, **kwargs)
ValueError: operands could not be broadcast together with shapes (2,) (3996,) 

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Applic.HPC/Easybuild/zen3/2022b/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/Applic.HPC/Easybuild/zen3/2022b/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/home/j/j_rink02/software/TE_Density/transposon/overlap_manager.py", line 101, in _process_overlap_job
    os.path.remove(result.overlap_file)
AttributeError: module 'posixpath' has no attribute 'remove'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/j/j_rink02/software/TE_Density/process_genome.py", line 305, in <module>
    overlap_results = overlap_mgr.calculate_overlap()
  File "/home/j/j_rink02/software/TE_Density/transposon/overlap_manager.py", line 280, in calculate_overlap
    pool.map(_process_overlap_job, todo)  # blocks execution
  File "/Applic.HPC/Easybuild/zen3/2022b/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/multiprocessing/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/Applic.HPC/Easybuild/zen3/2022b/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
AttributeError: module 'posixpath' has no attribute 'remove'

My python version is Python 3.10.9

My system:

LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.9.2009 (Core)
Release: 7.9.2009
Codename: Core

Could you help me with this issue?

sjteresi commented 3 months ago

Hi

Can you please share the output of pip freeze, or verify in some other way that your Python packages are at the correct versions? I ask because I ran into a similar issue with a collaborator who installed things via Conda, which pulled in incorrect versions. If I recall correctly, the culprit was the minor version number of one package.

Finally, can you please verify that everything works when you run make system_test in the root directory of the repository?
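
As an alternative to reading pip freeze by eye, the installed versions can be compared against the expected pins programmatically. The sketch below uses the standard library's importlib.metadata; the helper check_versions is hypothetical, and the pins are the h5py, NumPy, and pandas versions that appear in the pip freeze listings in this thread. The repository's requirements file remains the authoritative list.

```python
from importlib import metadata


def check_versions(expected: dict, lookup=metadata.version) -> dict:
    """Map each mismatched package to (expected, installed-or-None)."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = lookup(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


# Assumed pins, copied from the pip freeze output quoted in this thread.
EXPECTED = {"h5py": "3.8.0", "numpy": "1.24.2", "pandas": "1.5.2"}

for package, (wanted, installed) in check_versions(EXPECTED).items():
    print(f"{package}: expected {wanted}, found {installed}")
```

No output means every listed package matches its pin. The lookup parameter exists only so the function can be exercised without touching the real environment.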

janina-rinke commented 3 months ago

Hi Scott, Thank you for getting back to me.

I installed TE_Density using git clone as described in the installation instructions. I ran make system_test and everything seemed to work perfectly well. This is the last message I get when running it: INFO process density... complete.

Finally, this is the output of my pip freeze command:

alabaster @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/alabaster/alabaster-0.7.12
appdirs @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/appdirs/appdirs-1.4.4
asn1crypto @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/asn1crypto/asn1crypto-1.5.1
atomicwrites @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/atomicwrites/atomicwrites-1.4.1
attrs @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/attrs/attrs-22.1.0
Babel @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/Babel/Babel-2.11.0
backports.entry-points-selectable @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/backportsentrypointsselectable/backports.entry_points_selectable-1.2.0
backports.functools-lru-cache @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/backportsfunctools_lru_cache/backports.functools_lru_cache-1.6.4
bcrypt @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/bcrypt/bcrypt-4.0.1
bitstring @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/bitstring/bitstring-3.1.9
black==23.3.0
blist @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/blist/blist-1.3.6
CacheControl @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/CacheControl/CacheControl-0.12.11
cachy @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/cachy/cachy-0.3.0
certifi @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/certifi/certifi-2022.9.24
cffi @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/cffi/cffi-1.15.1
chardet @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/chardet/chardet-5.0.0
charset-normalizer @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/charsetnormalizer/charset-normalizer-2.1.1
cleo @ file:///Applic.HPC/Easybuild/sources/p/Python/extensions/cleo-1.0.0a5-py3-none-any.whl
click @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/click/click-8.1.3
clikit @ file:///Applic.HPC/Easybuild/sources/p/Python/extensions/clikit-0.6.2-py2.py3-none-any.whl
cloudpickle @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/cloudpickle/cloudpickle-2.2.0
colorama @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/colorama/colorama-0.4.6
coloredlogs==15.0.1
commonmark @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/commonmark/commonmark-0.9.1
contourpy==1.0.6
crashtest @ file:///Applic.HPC/Easybuild/sources/p/Python/extensions/crashtest-0.3.1-py3-none-any.whl
cryptography @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/cryptography/cryptography-38.0.3
cycler==0.11.0
Cython @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/Cython/Cython-0.29.32
decorator @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/decorator/decorator-5.1.1
distlib @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/distlib/distlib-0.3.6
docopt @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/docopt/docopt-0.6.2
docutils @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/docutils/docutils-0.19
doit @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/doit/doit-0.36.0
dulwich @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/dulwich/dulwich-0.20.50
ecdsa @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/ecdsa/ecdsa-0.18.0
editables @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/editables/editables-0.3
exceptiongroup==1.0.4
execnet @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/execnet/execnet-1.9.0
filelock @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/filelock/filelock-3.8.0
flit @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/flit/flit-3.8.0
flit_core @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/flit_core/flit_core-3.8.0
flit_scm @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/flit_scm/flit_scm-1.7.0
fonttools==4.38.0
fsspec @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/fsspec/fsspec-2022.11.0
future @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/future/future-0.18.2
glob2 @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/glob2/glob2-0.7
h5py==3.8.0
hatch-fancy-pypi-readme @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/hatch_fancy_pypi_readme/hatch_fancy_pypi_readme-22.8.0
hatch-vcs @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/hatch_vcs/hatch_vcs-0.2.0
hatchling @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/hatchling/hatchling-1.11.1
html5lib @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/html5lib/html5lib-1.1
humanfriendly==10.0
idna @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/idna/idna-3.4
imagesize @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/imagesize/imagesize-1.4.1
importlib-metadata @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/importlib_metadata/importlib_metadata-5.0.0
importlib-resources @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/importlib_resources/importlib_resources-5.10.0
iniconfig @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/iniconfig/iniconfig-1.1.1
intervaltree @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/intervaltree/intervaltree-3.1.0
intreehooks @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/intreehooks/intreehooks-1.0
ipaddress @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/ipaddress/ipaddress-1.0.23
jaraco.classes @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/jaracoclasses/jaraco.classes-3.2.3
jeepney @ file:///Applic.HPC/Easybuild/sources/p/Python/extensions/jeepney-0.8.0-py3-none-any.whl
Jinja2 @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/Jinja2/Jinja2-3.1.2
joblib @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/joblib/joblib-1.2.0
jsonschema @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/jsonschema/jsonschema-4.17.0
keyring @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/keyring/keyring-23.11.0
keyrings.alt @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/keyringsalt/keyrings.alt-4.2.0
kiwisolver==1.4.4
liac-arff @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/liacarff/liac-arff-2.5.0
lockfile @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/lockfile/lockfile-0.12.2
MarkupSafe @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/MarkupSafe/MarkupSafe-2.1.1
matplotlib==3.6.2
mock @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/mock/mock-4.0.3
more-itertools @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/moreitertools/more-itertools-9.0.0
msgpack @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/msgpack/msgpack-1.0.4
mypy-extensions==0.4.3
netaddr @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/netaddr/netaddr-0.8.0
netifaces @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/netifaces/netifaces-0.11.0
numexpr==2.8.4
numpy==1.24.2
packaging==23.1
pandas==1.5.2
paramiko @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/paramiko/paramiko-2.12.0
pastel @ file:///Applic.HPC/Easybuild/sources/p/Python/extensions/pastel-0.2.1-py2.py3-none-any.whl
pathlib2 @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pathlib2/pathlib2-2.3.7.post1
pathspec==0.10.2
pbr @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pbr/pbr-5.11.0
pexpect @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pexpect/pexpect-4.8.0
Pillow==9.3.0
pkginfo @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pkginfo/pkginfo-1.8.3
platformdirs==2.5.4
pluggy @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pluggy/pluggy-1.0.0
poetry @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/poetry/poetry-1.2.2
poetry-core @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/poetrycore/poetry-core-1.3.2
poetry-plugin-export @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/poetry_plugin_export/poetry_plugin_export-1.2.0
pooch @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pooch/pooch-1.6.0
psutil @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/psutil/psutil-5.9.4
ptyprocess @ file:///Applic.HPC/Easybuild/sources/p/Python/extensions/ptyprocess-0.7.0-py2.py3-none-any.whl
py @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/py/py-1.11.0
py-expression-eval @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/py_expression_eval/py_expression_eval-0.3.14
pyasn1 @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pyasn1/pyasn1-0.4.8
pycparser @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pycparser/pycparser-2.21
pycryptodome @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pycryptodome/pycryptodome-3.17
pydevtool @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pydevtool/pydevtool-0.3.0
Pygments @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/Pygments/Pygments-2.13.0
pylev @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pylev/pylev-1.4.0
PyNaCl @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/PyNaCl/PyNaCl-1.5.0
pyparsing @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pyparsing/pyparsing-3.0.9
pyrsistent @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pyrsistent/pyrsistent-0.19.2
pytest==7.3.1
pytest-xdist @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pytestxdist/pytest-xdist-3.1.0
python-dateutil @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pythondateutil/python-dateutil-2.8.2
pytoml @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pytoml/pytoml-0.1.21
pytz @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/pytz/pytz-2022.6
regex @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/regex/regex-2022.10.31
requests @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/requests/requests-2.28.1
requests-toolbelt @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/requeststoolbelt/requests-toolbelt-0.9.1
rich @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/rich/rich-13.1.0
rich-click @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/richclick/rich-click-1.6.0
scandir @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/scandir/scandir-1.10.0
scipy==1.9.3
SecretStorage @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/SecretStorage/SecretStorage-3.3.3
semantic-version @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/semantic_version/semantic_version-2.10.0
setuptools-rust @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/setuptoolsrust/setuptools-rust-1.5.2
setuptools-scm @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/setuptools_scm/setuptools_scm-7.0.5
shellingham @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/shellingham/shellingham-1.5.0
simplegeneric @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/simplegeneric/simplegeneric-0.8.1
simplejson @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/simplejson/simplejson-3.17.6
six @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/six/six-1.16.0
snowballstemmer @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/snowballstemmer/snowballstemmer-2.2.0
sortedcontainers @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/sortedcontainers/sortedcontainers-2.4.0
Sphinx @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/Sphinx/Sphinx-5.3.0
sphinx-bootstrap-theme @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/sphinxbootstraptheme/sphinx-bootstrap-theme-0.8.1
sphinxcontrib-applehelp @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/sphinxcontribapplehelp/sphinxcontrib-applehelp-1.0.2
sphinxcontrib-devhelp @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/sphinxcontribdevhelp/sphinxcontrib-devhelp-1.0.2
sphinxcontrib-htmlhelp @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/sphinxcontribhtmlhelp/sphinxcontrib-htmlhelp-2.0.0
sphinxcontrib-jsmath @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/sphinxcontribjsmath/sphinxcontrib-jsmath-1.0.1
sphinxcontrib-qthelp @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/sphinxcontribqthelp/sphinxcontrib-qthelp-1.0.3
sphinxcontrib-serializinghtml @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/sphinxcontribserializinghtml/sphinxcontrib-serializinghtml-1.1.5
sphinxcontrib-websupport @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/sphinxcontribwebsupport/sphinxcontrib-websupport-1.2.4
tables==3.7.0
tabulate @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/tabulate/tabulate-0.9.0
threadpoolctl @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/threadpoolctl/threadpoolctl-3.1.0
toml @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/toml/toml-0.10.2
tomli @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/tomli/tomli-2.0.1
tomli_w @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/tomli_w/tomli_w-1.0.0
tomlkit @ file:///Applic.HPC/Easybuild/sources/p/Python/extensions/tomlkit-0.11.6-py3-none-any.whl
tqdm==4.65.0
typing-extensions==3.7.4.3
tzdata==2023.3
ujson @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/ujson/ujson-5.5.0
urllib3 @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/urllib3/urllib3-1.26.12
virtualenv @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/virtualenv/virtualenv-20.16.6
wcwidth==0.1.7
webencodings @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/webencodings/webencodings-0.5.1
wrapt==1.11.2
xlrd @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/xlrd/xlrd-2.0.1
zipfile36 @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/zipfile36/zipfile36-0.1.3
zipp @ file:///tmp/Python/3.10.8/GCCcore-12.2.0/zipp/zipp-3.10.0

sjteresi commented 3 months ago

@teresi Can I get your assistance here?

This is a pip freeze output format that I am unfamiliar with. It looks like they have the main TE Density packages at the correct versions (h5py, NumPy, and pandas), but they also have additional packages installed on top (hence the different format and the extra entries). They also said the system test worked fine... Do you have any advice?

@janina-rinke Are you able to create a minimal TE Density install? I'm curious whether your extra packages are interfering in some way (the entries with @ followed by a file location).

janina-rinke commented 3 months ago

Hi! I created a virtual environment and re-installed TE Density. Everything worked perfectly fine and I have all the packages specified in the requirements file within this environment. It is running under Python 3.11.3 now. However, with this Python version I was not able to install tables==3.7.0 and had to install tables==3.9.2 instead.

This is the output of my pip freeze command:

DEPRECATION: Loading egg at /TEdensity/lib/python3.11/site-packages/te_density-2.1.1-py3.11.egg is deprecated. pip 24.3 will enforce this behaviour change. A possible replacement is to use pip for package installation.. Discussion can be found at https://github.com/pypa/pip/issues/12330
attrs==22.1.0
black==23.3.0
blosc2==2.5.1
click==8.1.3
coloredlogs==15.0.1
contourpy==1.0.6
cycler==0.11.0
Cython==3.0.8
distlib==0.3.6
exceptiongroup==1.0.4
filelock==3.12.2
fonttools==4.38.0
h5py==3.8.0
humanfriendly==10.0
iniconfig==1.1.1
kiwisolver==1.4.4
matplotlib==3.6.2
msgpack==1.0.7
mypy-extensions==0.4.3
ndindex==1.7
numexpr==2.8.4
numpy==1.24.2
packaging==23.1
pandas==1.5.2
pathspec==0.10.2
Pillow==9.3.0
platformdirs==3.8.0
pluggy==1.0.0
py-cpuinfo==9.0.0
pyparsing==3.0.9
pytest==7.3.1
python-dateutil==2.8.2
pytz==2022.6
scipy==1.9.3
six==1.16.0
tables==3.9.2
te-density==2.1.1
tomli==2.0.1
tqdm==4.65.0
typing-extensions==3.7.4.3
virtualenv==20.23.1
wcwidth==0.1.7
wrapt==1.11.2

Running process_genome.py with only one scaffold works perfectly fine and produces the expected output file. However, as soon as I try to run TE Density with the complete gene and TE annotation, I get this error again:

2024-01-31 16:57:07 r07n19.palma.wwu __main__[14902] INFO preprocessing... complete
2024-01-31 16:57:07 r07n19.palma.wwu __main__[14902] INFO process overlap...
2024-01-31 16:57:07 r07n19.palma.wwu OverlapManager[14902] INFO output overlap data to /TE_density/output/tmp/overlap
process     :  74%|███████████████████▎      | 112/151 [00:05<00:03, 12.28it/s]
2024-01-31 16:57:12 r07n19.palma.wwu _ProgressBars[14902] ERROR failed to process gene '/TE_density/output/filtered_input_data/input_cache/GAGA-0515_Scaffold19_GeneData.tsv':  operands could not be broadcast together with shapes (2,) (3274,)
process     :  85%|██████████████████████    | 128/151 [00:07<00:04,  5.51it/s]
2024-01-31 16:57:15 r07n19.palma.wwu _ProgressBars[14902] ERROR failed to process gene '/TE_density/output/filtered_input_data/input_cache/GAGA-0515_Scaffold2_GeneData.tsv':  operands could not be broadcast together with shapes (2,) (8782,)
process     :  99%|█████████████████████████▊| 150/151 [00:15<00:00,  9.59it/s]
genes       :  94%|███████████████████▊ | 10454/11078 [00:15<00:00, 668.52it/s]
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/TE_Density/transposon/overlap_manager.py", line 98, in _process_overlap_job
    result = _calculate_overlap_job(job)

Something seems to be odd for several scaffolds when processing the overlap data with the overlap_manager.py script. Do you have any idea why that could be?

sjteresi commented 3 months ago

Hmm, I am rather at a loss as to why that is happening, especially if the system test is working fine. I wonder if wires are being crossed somewhere and the program is either reading in the genes/TEs wrong or it is trying to calculate TE Density between the wrong scaffolds, or one of the datasets is missing something. I believe I wrote a decent amount of checks to prevent that from happening, but obviously something is wrong.

I'd like to reproduce this and see if I can investigate the input data. With that, @teresi and I should be able to dig into the overlap_manager.py script more effectively. Are you comfortable sharing your input data and cleaned files with me? My email is teresisc@msu.edu

sjteresi commented 3 months ago

Hi Janina, I have addressed the issue with a bugfix, and I was able to fully run the data you sent me over email. Your gene annotation contains non-unique gene names; I have added a check that raises an error for users at the beginning of the pipeline.
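
For readers who hit the same message, here is a minimal sketch of how a duplicated gene name could produce a (2,) operand. It assumes gene coordinates are held in a pandas DataFrame indexed by gene name, which is an assumption about TE_Density's internals rather than something confirmed in this thread. The frames, the column names, and the helper gene_stop are all hypothetical.

```python
import numpy as np
import pandas as pd


def gene_stop(genes: pd.DataFrame, name: str):
    """Look up one gene's stop coordinate by its index label (hypothetical helper)."""
    return genes.loc[name, "Stop"]


# Hypothetical TE stop positions on one scaffold.
te_stops = np.array([150, 300, 450, 700, 950])

# Unique gene names: the lookup yields a single number, which broadcasts fine.
unique_genes = pd.DataFrame(
    {"Start": [100, 500], "Stop": [200, 600]},
    index=["gene_a", "gene_b"],
)
ok = np.minimum(gene_stop(unique_genes, "gene_b"), te_stops)

# Duplicated gene name: the same lookup now yields a Series with two entries.
dup_genes = pd.DataFrame(
    {"Start": [100, 500, 900], "Stop": [200, 600, 1000]},
    index=["gene_a", "gene_b", "gene_b"],
)
dup_stop = gene_stop(dup_genes, "gene_b")

try:
    np.minimum(dup_stop, te_stops)
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

The first operand's length equals the number of rows sharing the gene name, and the second equals the number of TEs on that scaffold, which matches the pattern of a constant (2,) next to a varying second shape in the errors reported above.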

My apologies, the error should have appeared sooner rather than deep in the pipeline, where it surfaced as a cryptic error message.
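
A fail-fast check of this kind can be sketched in a few lines with pandas. This is an illustration only, not the check shipped in v2.1.2; the function names find_duplicate_gene_names and require_unique_gene_names are hypothetical.

```python
import pandas as pd


def find_duplicate_gene_names(gene_names) -> list:
    """Return the sorted gene names that occur more than once."""
    names = pd.Series(list(gene_names), dtype="object")
    return sorted(names[names.duplicated(keep=False)].unique())


def require_unique_gene_names(gene_names) -> None:
    """Raise a readable error up front instead of failing later during overlap."""
    duplicates = find_duplicate_gene_names(gene_names)
    if duplicates:
        preview = ", ".join(duplicates[:5])
        raise ValueError(
            f"{len(duplicates)} gene name(s) are not unique, e.g.: {preview}"
        )


# Example: report the offending names before any heavy computation starts.
print(find_duplicate_gene_names(["gene_a", "gene_b", "gene_b", "gene_c"]))
```

Users on older versions can run a check like this on the gene name column of their annotation before starting the pipeline.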

Please update to v2.1.2