See this issue with Metalign. In short, I figured out the probable source of the program hanging: python3 vs python2 for CMash.
After setting up the data, this setup leads to the hang:
python3 -m venv VE3
source VE3/bin/activate
./setup_libraries.sh
This hangs:
python3 metalign.py test/RL_S001insert_270_1M_subset.fq data/ --output test/RL_S001insert_270_1M_subset_results.tsv
This is the command causing the hang:
python3 StreamingQueryDNADatabase.py ../../data/r7aqo9zw/60mers_intersection_dump.fa ../../data/cmash_db_n1000_k60.h5 ../../test/CMash_out.csv 30-60-10 -c 0 -r 10000 -v -f ../../data/cmash_filter_n1000_k60_30-60-10.bf --sensitive
So instead, try python2, and it doesn't hang:
virtualenv VE2
source VE2/bin/activate
cd CMash
pip install -r requirements.txt
this runs just fine and does not hang:
python StreamingQueryDNADatabase.py ../../data/r7aqo9zw/60mers_intersection_dump.fa ../../data/cmash_db_n1000_k60.h5 ../../test/CMash_out.csv 30-60-10 -c 0 -r 10000 -v -f ../../data/cmash_filter_n1000_k60_30-60-10.bf --sensitive
This works too (oddly enough, since it's being called with python3, so it looks like only installing CMash with python2 is required):
python3 metalign.py test/RL_S001insert_270_1M_subset.fq data/ --output test/RL_S001insert_270_1M_subset_results.tsv
also works (as it appears metalign.py is python2/3 compliant):
python metalign.py test/RL_S001insert_270_1M_subset.fq data/ --output test/RL_S001insert_270_1M_subset_results.tsv
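To compare the two interpreters without waiting on a hung process indefinitely, the commands above could be wrapped in a time limit. This is only a rough sketch: `hangs` is a hypothetical helper name, and it assumes GNU coreutils `timeout` is available, which exits with status 124 when the limit expires.

```shell
# Hypothetical helper: succeeds (exit 0) if the command is still running
# when the time limit expires, fails otherwise.
# Assumes GNU coreutils `timeout`, which exits with status 124 on timeout.
hangs() {
    limit="$1"
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    [ "$?" -eq 124 ]
}
```

For example, `hangs 600 python3 StreamingQueryDNADatabase.py` followed by the same arguments as above would report a hang if the query has not finished after 600 seconds. The limit is arbitrary and would need to be longer than a normal run.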
So possible solutions (with my assessment of ease of implementation):
Make setup_libraries.sh use python2 when installing CMash (probably via a virtualenv) (easy)
See why marisa-trie isn't working with python3 (their repo says it's python3 compatible) (medium)
Refactor CMash so it's python3 compliant (hard)
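The first option might look roughly like the following, reusing the python2 steps that worked above. This is a sketch and not the actual contents of setup_libraries.sh: `install_cmash_py2` is a hypothetical function name, and it assumes `virtualenv` and a `python2` interpreter are on PATH and that it runs from the directory containing the CMash checkout.

```shell
# Sketch of option 1: install CMash's requirements into a python2 virtualenv.
# Assumptions: `virtualenv` and `python2` are on PATH, and the current
# directory contains the CMash checkout. VE2 is the env name used above.
install_cmash_py2() {
    # Create the environment with an explicit python2 interpreter.
    virtualenv -p python2 VE2 || return 1
    # Activate it so that `pip` below resolves to the python2 environment.
    . VE2/bin/activate
    # Install in a subshell so the caller's working directory is unchanged.
    (cd CMash && pip install -r requirements.txt)
}
```

The environment stays active in the calling shell after the function returns, which matches the manual steps above where metalign.py was run afterwards.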