bcgsc / goldrush

Linear-time de novo Long Read Assembler
GNU General Public License v3.0

Debian Installation issues - memory free issue on test dataset #120

Closed gringer closed 11 months ago

gringer commented 1 year ago

To get this working on my Debian system, I needed to copy calc_phred_avg from the btllib code directly into the goldrush-edit sources [I notice that goldrush already has code with a similar function name], because my Debian system had an earlier version of btllib. Unfortunately, once I got that compiling, I hit an invalid-free error:

gringer@musculus:~/install/goldrush-git/tests$ ./goldrush_test_demo.sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 97.4M  100 97.4M    0     0  1705k      0  0:00:58  0:00:58 --:--:-- 1745k
Launching GoldRush
goldrush-path  -k 22 -w 16 -t 1000 -u 5 -a 1 -o 0.1 -p goldrush_test_silver_path -i test_reads.fq  -h 3 -j 4 -x10 -P 15 -d 5 -s 1011011110110111101101 -g 1e6 -b 10 -r 0.9 --silver_path -M 5 -m 20000
Using preset spaced seed
with:   
        span: 22
        weight: 16
Calculating 5 silver path(s)
Using:  
        tile length: 1000
        block size: 10
        seed patterns: 3
        threshold: 10
        base seed pattern: 1011011110110111101101
        minimum unassigned tiles: 5
        maximum assigned tiles: 1
        expected hash space: 3000000
        minimum average phred quality score: 15
        maximum average phred delta between first and second half of read: 5
        occupancy: 0.1
        jobs: 4
allocating bit vector
m_filterSize: 28473728
finished allocating bit vector
in 0.0012
opening: test_reads.fq
inserting bit vector
finished inserting bit vector
in 2.2366
src/tcmalloc.cc:333] Attempt to free invalid pointer 0x7fff774e4c60
make: *** [/usr/local/bin/goldrush:226: goldrush_test_silver_path_5.fq] Aborted (core dumped)

My computer has 64 GB main memory, and 800 GB SSD swap, so it shouldn't be having memory issues for this.

gringer commented 1 year ago

Backtrace from gdb:

Thread 1 "goldrush-path" received signal SIGABRT, Aborted.
0x00007ffff78eb0fc in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007ffff78eb0fc in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff789d472 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff78874b2 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007ffff7c16d22 in ?? () from /lib/x86_64-linux-gnu/libtcmalloc.so.4
#4  0x00007ffff7c18051 in ?? () from /lib/x86_64-linux-gnu/libtcmalloc.so.4
#5  0x00005555555aedc0 in btllib::SeqReader::~SeqReader() ()
#6  0x0000555555572c25 in fill_bit_vector (input_file=..., miBFCS=..., min_seq_len=20000, spaced_seeds=..., filter_out_reads=...) at ../goldrush_path/goldrush_path.cpp:188
#7  0x0000555555575d8b in main (argc=39, argv=0x7fffffffd928) at ../goldrush_path/goldrush_path.cpp:1015

This seems to be happening at the end of the fill_bit_vector function:

   184
   185    std::cerr << "finished inserting bit vector" << std::endl;
   186    std::cerr << "in " << setprecision(4) << fixed << omp_get_wtime() - sTime
   187              << "\n";
   188  }
   189  

I wonder if this is an issue with me using an older version of btllib (i.e. the one available in my Debian distribution, v1.4.10).

gringer commented 1 year ago

I installed the newest release of btllib, following the compile instructions here, using --prefix /usr/local.

I also added #include <cstdint> in the places where the goldrush compile errors suggested.

This got past the memory allocation issue, but I'm now having trouble with the python library:

goldrush-ulimit goldrush-edit --minimap2 -t4 -m/dev/shm goldrush_test_golden_path.fa test_reads.fq goldrush_test_golden_path.goldrush-edit-polished.fa
Running with 256112 max processes: goldrush-edit --minimap2 -t4 -m/dev/shm goldrush_test_golden_path.fa test_reads.fq goldrush_test_golden_path.goldrush-edit-polished.fa
Traceback (most recent call last):
  File "/usr/local/bin/goldrush-edit", line 23, in <module>
    import btllib
ModuleNotFoundError: No module named 'btllib'
make: *** [/usr/local/bin/goldrush:233: goldrush_test_golden_path.goldrush-edit-polished.fa] Error 1

The recommended approach didn't work for me:

$ python3 -m pip install /home/gringer/install/btllib/btllib-1.6.2/install/lib/btllib/python
error: externally-managed-environment

× This environment is externally managed
╰─> To install Python packages system-wide, try apt install
    python3-xyz, where xyz is the package you are trying to
    install.

    If you wish to install a non-Debian-packaged Python package,
    create a virtual environment using python3 -m venv path/to/venv.
    Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
    sure you have python3-full installed.

    If you wish to install a non-Debian packaged Python application,
    it may be easiest to use pipx install xyz, which will manage a
    virtual environment for you. Make sure you have pipx installed.

    See /usr/share/doc/python3.11/README.venv for more information.

note: If you believe this is a mistake, please contact your Python installation or OS distribution provider. You can override this, at the risk of breaking your Python installation or OS, by passing --break-system-packages.
hint: See PEP 668 for the detailed specification.

Because it's a library, pipx didn't work either:

$ pipx install /home/gringer/install/btllib/btllib-1.6.2/install/lib/btllib/python

No apps associated with package btllib or its dependencies. If you are attempting to install a library, pipx should not be used. Consider using pip or a similar tool instead.

As a quick hack (because Debian, and I don't really understand how python's packaging works), I installed the python library into /usr/local/bin:

sudo python3 -m pip install -t /usr/local/bin /home/gringer/install/btllib/btllib-1.6.2/install/lib/btllib/python

... and now it's complaining about Tigmint (which I notice is mentioned in the goldrush dependencies). So... progress!
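
The externally-managed-environment error above is pip honouring PEP 668: Debian marks its system Python as managed by apt, and pip refuses to install into it unless a virtual environment is active. As a minimal sketch (the function names are illustrative and not part of pip, btllib or GoldRush), the following Python checks for the marker file that PEP 668 defines and for an active virtual environment:

```python
import sys
import sysconfig
from pathlib import Path


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


def is_externally_managed(stdlib_dir=None) -> bool:
    """True when the PEP 668 marker file sits in the standard-library directory."""
    if stdlib_dir is None:
        stdlib_dir = sysconfig.get_path("stdlib")
    return (Path(stdlib_dir) / "EXTERNALLY-MANAGED").is_file()


if __name__ == "__main__":
    if is_externally_managed() and not in_virtualenv():
        print("System Python is externally managed; pip will refuse to install here.")
        print("Consider a virtual environment, e.g.:")
        # The target directory below is a hypothetical example path.
        print("  python3 -m venv --system-site-packages ~/venvs/goldrush")
    else:
        print("pip can install into this environment.")
```

If the check reports an externally managed interpreter, one option is a virtual environment created with python3 -m venv --system-site-packages, which keeps Debian-packaged modules such as python3-pysam importable while letting pip install the btllib wrapper into the environment. Whether goldrush-edit then picks up that interpreter depends on how it is launched, so treat this as a starting point rather than a verified recipe. The -t /usr/local/bin workaround appears to work because Python puts the directory of the running script first on its module search path.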

gringer commented 1 year ago

Next non-obvious error (after installing tigmint manually into /usr/local/bin, and installing python3-pysam, and python3-pybedtools, and python3-intervaltree, and python3-igraph) was this one:

Traceback (most recent call last):
  File "/usr/local/bin/tigmint_molecule_paf.py", line 141, in <module>
    main()
  File "/usr/local/bin/tigmint_molecule_paf.py", line 138, in main
    MolecIdentifierPaf().run()
  File "/usr/local/bin/tigmint_molecule_paf.py", line 98, in run
    self.print_new_molecule(prev_barcode, cur_intervals, out_molecules_file)
  File "/usr/local/bin/tigmint_molecule_paf.py", line 43, in print_new_molecule
    barcode_match = re.search(r'^BX:Z:(\S+)', barcode)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/re/__init__.py", line 176, in search
    return _compile(pattern, flags).search(string)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected string or bytes-like object, got 'NoneType'

This initially doesn't seem to affect the running of the program; it continues along despite this error.
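
A likely reason the run carries on: the full log later in this thread shows tigmint_molecule_paf.py running in the middle of a shell pipeline that ends in sort, and a plain POSIX sh pipeline reports only the exit status of its last command. The sketch below illustrates that shell behaviour in isolation; it does not call Tigmint, and both no-such-tool-for-demo and pipeline_status are made-up names.

```python
import subprocess


def pipeline_status(command: str) -> int:
    """Run a command string through POSIX sh and return its exit status."""
    result = subprocess.run(
        ["sh", "-c", command],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    # The first stage fails (command not found), but sort succeeds on empty input,
    # so the pipeline as a whole is reported as successful.
    print("failing stage piped into sort:", pipeline_status("no-such-tool-for-demo | sort"))
    # Run on its own, the same missing command yields the shell's "not found" status.
    print("failing command alone:", pipeline_status("no-such-tool-for-demo"))
```

With a POSIX shell the first call reports 0 and the second 127, the same code make prints when ntLink_rounds cannot be found. An empty or truncated intermediate file can therefore pass silently to later steps, so a traceback in the middle of the log is worth treating as a real failure even when make keeps going.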

The program finishes with another error (and I notice now that ntlink is also mentioned in the dependencies):

make: ntLink_rounds: No such file or directory
make: *** [/usr/local/bin/goldrush:259: goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa.k40.w250.z1000.ntLink.gap_fill.5rounds.fa] Error 127

lcoombe commented 1 year ago

Hi @gringer,

Thanks for the reports. I'm assuming that you were using the latest commit of GoldRush (vs. a release?). If so, yes, you do need a later version of btllib, as you found.

For installations, when possible I highly recommend installing via conda (https://anaconda.org/bioconda/goldrush). GoldRush is available on conda, and this would help to solve/bypass many of the issues that you are seeing. All of the dependencies listed in the README are required and can also be installed with conda.

I think we should address the installation issues one by one. Since you found workarounds for the steps prior to Tigmint, let's start at Tigmint. The error you are seeing there would be expected to impact your results. Can you please include your full log (at least from the Tigmint point onwards)? Have you installed all the Tigmint dependencies (as well as the ones you listed)? We also have tests over at the Tigmint repository (https://github.com/bcgsc/tigmint/tree/master/tests) that can help to test your installation.

Thanks for your interest in GoldRush, Lauren

gringer commented 1 year ago

Okay, thanks. Here's the log of my run from downloaded goldrush version 1.0.3:

gringer@musculus:~/install/goldrush/goldrush-1.0.3/tests$ ./goldrush_test_demo.sh
--2023-09-07 09:59:12--  https://www.bcgsc.ca/downloads/btl/goldrush/test/test_reads.fq
Resolving www.bcgsc.ca (www.bcgsc.ca)... 134.87.4.82, 134.87.4.81
Connecting to www.bcgsc.ca (www.bcgsc.ca)|134.87.4.82|:443... connected.
HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable

    The file is already fully retrieved; nothing to do.   

Launching GoldRush
goldrush-path  -k 22 -w 16 -t 1000 -u 5 -a 1 -o 0.1 -p goldrush_test_silver_path -i test_reads.fq  -h 3 -j 4 -x10 -P 15 -d 5 -s 1011011110110111101101 -g 1e6 -b 10 -r 0.9 --silver_path -M 5 -m 20000
Using preset spaced seed
with:
        span: 22
        weight: 16
Calculating 5 silver path(s)
Using:
        tile length: 1000
        block size: 10
        seed patterns: 3
        threshold: 10
        base seed pattern: 1011011110110111101101
        minimum unassigned tiles: 5
        maximum assigned tiles: 1
        expected hash space: 3000000
        minimum average phred quality score: 15
        maximum average phred delta between first and second half of read: 5
        occupancy: 0.1
        jobs: 4
allocating bit vector
m_filterSize: 28473728
finished allocating bit vector
in 0.0020
opening: test_reads.fq
inserting bit vector
finished inserting bit vector
in 4.4033
assigning tiles
assigned
in 10.2727
cat goldrush_test_silver_path_*.fq > goldrush_test_silver_path_all.fq
goldrush-path  -k 22 -w 16 -t 1000 -u 5 -a 1 -o 0.1 -p goldrush_test_golden_path -i goldrush_test_silver_path_all.fq -h 3 -j 4 -P 15 -d 5 -x10 -s 1011011110110111101101 -g 1e6 -b 10  -m 0
Using preset spaced seed
with:
        span: 22
        weight: 16
Calculating the golden path
Using:
        tile length: 1000
        block size: 10
        seed patterns: 3
        threshold: 10
        base seed pattern: 1011011110110111101101
        minimum unassigned tiles: 5
        maximum assigned tiles: 1
        expected hash space: 3000000
        minimum average phred quality score: 15
        maximum average phred delta between first and second half of read: 5
        occupancy: 0.1
        jobs: 4
allocating bit vector
m_filterSize: 28473728
finished allocating bit vector
in 0.0014
opening: goldrush_test_silver_path_all.fq
inserting bit vector
finished inserting bit vector
in 0.2921
assigning tiles
assigned
in 0.5698
echo "Done GoldRush-Path! Golden path can be found in: goldrush_test_golden_path.fa"
Done GoldRush-Path! Golden path can be found in: goldrush_test_golden_path.fa
goldrush-ulimit goldrush-edit --minimap2 -t4 -m/dev/shm goldrush_test_golden_path.fa test_reads.fq goldrush_test_golden_path.goldrush-edit-polished.fa
Running with 256112 max processes: goldrush-edit --minimap2 -t4 -m/dev/shm goldrush_test_golden_path.fa test_reads.fq goldrush_test_golden_path.goldrush-edit-polished.fa
[2023-09-07 09:59:29][INFO] Building indexes and mappings...
[2023-09-07 09:59:30][INFO] make[1]: Entering directory '/home/gringer/install/goldrush/goldrush-1.0.3/tests'
goldrush-edit-index goldrush_test_golden_path.fa goldrush_test_golden_path.fa.index
minimap2 -t4 /home/gringer/install/goldrush/goldrush-1.0.3/tests/goldrush_test_golden_path.fa /home/gringer/install/goldrush/goldrush-1.0.3/tests/test_reads.fq >goldrush_test_golden_path.fa.test_reads.fq.paf
goldrush-edit-index test_reads.fq test_reads.fq.index
make[1]: Leaving directory '/home/gringer/install/goldrush/goldrush-1.0.3/tests'
[2023-09-07 09:59:29][INFO] SeqIndex::SeqIndex: Building index for goldrush_test_golden_path.fa...
[2023-09-07 09:59:29][INFO] SeqIndex::SeqIndex: Done.
[2023-09-07 09:59:29][INFO] SeqIndex::save: Saving index to goldrush_test_golden_path.fa.index...
[2023-09-07 09:59:29][INFO] SeqIndex::save: Done.
[M::mm_idx_gen::0.018*1.04] collected minimizers
[M::mm_idx_gen::0.023*1.62] sorted minimizers
[M::main::0.023*1.62] loaded/built the index for 141 target sequence(s)
[M::mm_mapopt_update::0.026*1.55] mid_occ = 10
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 141
[M::mm_idx_stat::0.027*1.53] distinct minimizers: 198345 (96.71% are singletons); average occurrences: 1.037; average spacing: 5.342; total length: 1098547
[M::worker_pipeline::0.775*3.60] mapped 3795 sequences
[M::main] Version: 2.26-r1175
[M::main] CMD: minimap2 -t4 /home/gringer/install/goldrush/goldrush-1.0.3/tests/goldrush_test_golden_path.fa /home/gringer/install/goldrush/goldrush-1.0.3/tests/test_reads.fq
[M::main] Real time: 0.778 sec; CPU: 2.793 sec; Peak RSS: 0.068 GB
[2023-09-07 09:59:30][INFO] SeqIndex::SeqIndex: Building index for test_reads.fq...
[2023-09-07 09:59:30][INFO] SeqIndex::SeqIndex: Done.
[2023-09-07 09:59:30][INFO] SeqIndex::save: Saving index to test_reads.fq.index...
[2023-09-07 09:59:30][INFO] SeqIndex::save: Done.

[2023-09-07 09:59:30][INFO] Indexes and mappings built.   
[2023-09-07 09:59:30][INFO] Subsampling mapped reads to 40
[2023-09-07 09:59:30][INFO] SeqIndex::SeqIndex: Loading index from /home/gringer/install/goldrush/goldrush-1.0.3/tests/goldrush_test_golden_path.fa.index...
[2023-09-07 09:59:30][INFO] SeqIndex::SeqIndex: Done!
[2023-09-07 09:59:30][INFO] SeqIndex::SeqIndex: Loading index from /home/gringer/install/goldrush/goldrush-1.0.3/tests/test_reads.fq.index...
[2023-09-07 09:59:30][INFO] SeqIndex::SeqIndex: Done!
[2023-09-07 09:59:30][INFO] AllMappings::load_paf: Loading PAF mappings from /home/gringer/install/goldrush/goldrush-1.0.3/tests/goldrush_test_golden_path.fa.test_reads.fq.paf...
[2023-09-07 09:59:30][INFO] AllMappings::load_paf: Done!  
[2023-09-07 09:59:30][INFO] serve: Accepting batch names at batch_name_input
[2023-09-07 09:59:32][INFO] goldrush-edit-targeted-bfs is ready!
[2023-09-07 09:59:32][INFO] Polishing batches...
[2023-09-07 10:00:31][INFO] Done polishing batches, ending BF builder process...                                                                                                   
[2023-09-07 10:00:31][INFO] serve: Targeted BF builder done!
[2023-09-07 10:00:31][INFO] Polisher done
echo "Done GoldRush-Path + GoldRush-Edit! GoldRush-Edit polished golden path can be found in: goldrush_test_golden_path.goldrush-edit-polished.fa"
Done GoldRush-Path + GoldRush-Edit! GoldRush-Edit polished golden path can be found in: goldrush_test_golden_path.goldrush-edit-polished.fa
tigmint-make tigmint-long draft=goldrush_test_golden_path.goldrush-edit-polished reads=test_reads cut=250 t=4 G=1e6 span=2 dist=500
make[1]: Entering directory '/home/gringer/install/goldrush/goldrush-1.0.3/tests'
/usr/local/bin/tigmint_estimate_dist.py test_reads.fq -n 1000000 -o test_reads.tigmint-long.params.tsv
sh -c '/usr/local/bin/../src/long-to-linked-pe -l 250 -m2000 -g1e6 -s -b test_reads.barcode-multiplicity.tsv --bx -t4 --fasta -f test_reads.tigmint-long.params.tsv test_reads.fq | \
minimap2 -y -t4 -x map-ont --secondary=no goldrush_test_golden_path.goldrush-edit-polished.fa - | \
/usr/local/bin/tigmint_molecule_paf.py -q0 -s2000 -d500 - | sort -k1,1 -k2,2n -k3,3n  > goldrush_test_golden_path.goldrush-edit-polished.test_reads.cut250.molecule.size2000.dist500.bed'
sh: 1: /usr/local/bin/../src/long-to-linked-pe: not found 
[M::mm_idx_gen::0.018*1.04] collected minimizers
[M::mm_idx_gen::0.024*1.63] sorted minimizers
[M::main::0.024*1.63] loaded/built the index for 141 target sequence(s)
[M::mm_mapopt_update::0.026*1.56] mid_occ = 10
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 141
[M::mm_idx_stat::0.028*1.54] distinct minimizers: 195765 (95.54% are singletons); average occurrences: 1.050; average spacing: 5.348; total length: 1099318
[M::main] Version: 2.26-r1175
[M::main] CMD: minimap2 -y -t4 -x map-ont --secondary=no goldrush_test_golden_path.goldrush-edit-polished.fa -
[M::main] Real time: 0.030 sec; CPU: 0.045 sec; Peak RSS: 0.013 GB
Traceback (most recent call last):
  File "/usr/local/bin/tigmint_molecule_paf.py", line 141, in <module>
    main()
  File "/usr/local/bin/tigmint_molecule_paf.py", line 138, in main
    MolecIdentifierPaf().run()
  File "/usr/local/bin/tigmint_molecule_paf.py", line 98, in run
    self.print_new_molecule(prev_barcode, cur_intervals, out_molecules_file)
  File "/usr/local/bin/tigmint_molecule_paf.py", line 43, in print_new_molecule
    barcode_match = re.search(r'^BX:Z:(\S+)', barcode)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/re/__init__.py", line 176, in search
    return _compile(pattern, flags).search(string)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected string or bytes-like object, got 'NoneType'
samtools faidx goldrush_test_golden_path.goldrush-edit-polished.fa
/usr/local/bin/tigmint-cut -p4 -w1000 -n2 -t0 -m3000 -o goldrush_test_golden_path.goldrush-edit-polished.test_reads.cut250.molecule.size2000.dist500.trim0.window1000.span2.breaktigs.fa goldrush_test_golden_path.goldrush-edit-polished.fa goldrush_test_golden_path.goldrush-edit-polished.test_reads.cut250.molecule.size2000.dist500.bed
Started at: 2023-09-07 10:00:31.650698
Reading contig lengths...
Finding breakpoints...
Attempted corrections: 0
Cutting assembly at breakpoints...
DONE!
Ended at: 2023-09-07 10:00:31.668287
ln -sf goldrush_test_golden_path.goldrush-edit-polished.test_reads.cut250.molecule.size2000.dist500.trim0.window1000.span2.breaktigs.fa goldrush_test_golden_path.goldrush-edit-polished.cut250.tigmint.fa
make[1]: Leaving directory '/home/gringer/install/goldrush/goldrush-1.0.3/tests'
ln -sf goldrush_test_golden_path.goldrush-edit-polished.cut250.tigmint.fa goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa
echo "Done GoldRush-Path + GoldRush-Edit + Tigmint-long! Post-Tigmint-long golden path can be found in: goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa"
Done GoldRush-Path + GoldRush-Edit + Tigmint-long! Post-Tigmint-long golden path can be found in: goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa
rm -f goldrush_test_silver_path_*.fq
Clean Done
ntLink_rounds run_rounds_gaps target=goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa t=4 k=40 w=250 z=1000 rounds=5 reads=test_reads.fq
make[1]: Entering directory '/home/gringer/install/goldrush/goldrush-1.0.3/tests'
ntLink scaffold gap_fill target=goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa reads=test_reads.fq k=40 w=250 z=1000
make[2]: Entering directory '/home/gringer/install/goldrush/goldrush-1.0.3/tests'
indexlr --long --pos --strand -k 40 -w 250 -t 4 goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa > goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa.k40.w250.tsv
sh -c 'gzip -f -cd test_reads.fq | \
indexlr --long --pos --strand --len -k 40 -w 250 -t 4 - | \
/usr/local/bin/bin/ntlink_pair.py -p goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa.k40.w250.z1000 -n 1 -m goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa.k40.w250.tsv -s goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa  \
-k 40 -a 1 -z 1000 -f 10 -x 0 --verbose -'
Running pairing stage of ntLink ...

Parameters:
        Read minimizer files:  ['-']
        -s  goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa
        -m  goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa.k40.w250.tsv
        -p  goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa.k40.w250.z1000
        -n  1
        -k  40
        -a  1
        -z  1000
        -f  10
        -x  0.0
        -c  goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa.k40.w250.z1000.verbose_mapping.tsv
Found checkpoint file, bypassing read mapping...

2023-09-07 10:00:31.847989 : Reading fasta file goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa
2023-09-07 10:00:31.848738 : Finding pairs
Traceback (most recent call last):
  File "/usr/local/bin/bin/ntlink_pair.py", line 584, in main
    pairs = self.find_scaffold_pairs_checkpoints()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/bin/bin/ntlink_pair.py", line 448, in find_scaffold_pairs_checkpoints
    self.parse_verbose_entries(curr_mappings, pairs)
  File "/usr/local/bin/bin/ntlink_pair.py", line 486, in parse_verbose_entries
    self.tally_pairs_from_mappings(accepted_anchor_contigs, contig_runs, length_read, pairs)
  File "/usr/local/bin/bin/ntlink_pair.py", line 420, in tally_pairs_from_mappings
    self.add_pair(accepted_anchor_contigs, ctg_i, ctg_j, pairs, length_long_read)
  File "/usr/local/bin/bin/ntlink_pair.py", line 317, in add_pair
    pair, gap_est = self.calculate_pair_info(MinimizerEdge(mx_i.mx_hash, mx_i.position, mx_i.strand,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/bin/bin/ntlink_pair.py", line 235, in calculate_pair_info
    gap_estimate = self.calculate_gap_size(mx_edge.mx_i, source_ori, mx_edge.mx_j, target_ori,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/bin/bin/ntlink_pair.py", line 163, in calculate_gap_size
    u_ctglen = NtLink.scaffolds[u_ctg].length
               ~~~~~~~~~~~~~~~~^^^^^^^
KeyError: 'SRR10028109.3103_trimmed_untrimmed::SRR10028109.3103_trimmed_untrimmed:0-5998'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/bin/ntlink_pair.py", line 622, in <module>
    main()
  File "/usr/local/bin/bin/ntlink_pair.py", line 619, in main
    NtLink().main()
  File "/usr/local/bin/bin/ntlink_pair.py", line 611, in main
    raise NtlinkPairError("ntLink pairing stage encountered an error..")
NtlinkPairError: ntLink pairing stage encountered an error..
make[2]: *** [/usr/local/bin/ntLink:212: goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa.k40.w250.z1000.n1.scaffold.dot] Error 1
make[2]: Leaving directory '/home/gringer/install/goldrush/goldrush-1.0.3/tests'
make[1]: *** [/usr/local/bin/ntLink_rounds:96: goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa.k40.w250.z1000.ntLink.gap_fill.fa] Error 2
make[1]: Leaving directory '/home/gringer/install/goldrush/goldrush-1.0.3/tests'
make: *** [/usr/local/bin/goldrush:259: goldrush_test_golden_path.goldrush-edit-polished.span2.dist500.tigmint.fa.k40.w250.z1000.ntLink.gap_fill.5rounds.fa] Error 2

I found information about how to install conda on my Debian system here:

https://docs.conda.io/projects/conda/en/latest/user-guide/install/rpm-debian.html

Unfortunately, goldrush is not installing properly with that method:

gringer@musculus:~/install/goldrush-git$ source /opt/conda/etc/profile.d/conda.sh
gringer@musculus:~/install/goldrush-git$ conda -V
conda 23.7.3
gringer@musculus:~/install/goldrush-git$ conda install -c bioconda goldrush
Retrieving notices: ...working... done
Collecting package metadata (current_repodata.json): done
Solving environment: unsuccessful initial attempt using frozen solve. Retrying with flexible solve.
Solving environment: unsuccessful attempt using repodata from current_repodata.json, retrying with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: \ 
Found conflicts! Looking for incompatible packages.                                                                                                                              failed

UnsatisfiableError: The following specifications were found
to be incompatible with the existing python installation in your environment:

Specifications:

  - goldrush -> python[version='>=3.8,<3.9.0a0|>=3.9,<3.10.0a0']

Your python: python=3.10

If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.

The following specifications were found to be incompatible with your system:

  - feature:/linux-64::__glibc==2.37=0
  - feature:|@/linux-64::__glibc==2.37=0
  - goldrush -> gperftools -> __glibc[version='>=2.17,<3.0.a0']

Your installed version is: 2.37

lcoombe commented 1 year ago

Hi @gringer,

It looks like there is still an issue with your Tigmint installation - I see this error:

sh -c '/usr/local/bin/../src/long-to-linked-pe -l 250 -m2000 -g1e6 -s -b test_reads.barcode-multiplicity.tsv --bx -t4 --fasta -f test_reads.tigmint-long.params.tsv test_reads.fq | \
minimap2 -y -t4 -x map-ont --secondary=no goldrush_test_golden_path.goldrush-edit-polished.fa - | \
/usr/local/bin/tigmint_molecule_paf.py -q0 -s2000 -d500 - | sort -k1,1 -k2,2n -k3,3n  > goldrush_test_golden_path.goldrush-edit-polished.test_reads.cut250.molecule.size2000.dist500.bed'
sh: 1: /usr/local/bin/../src/long-to-linked-pe: not found 

So, one of the scripts required by Tigmint cannot be found.
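
A quick way to catch this kind of problem before a long run is to check that each executable seen in the logs can actually be found. The following is a rough sketch, not part of GoldRush or Tigmint: the tool list is collected from the commands shown in this thread and may be incomplete, and the function names are illustrative. Note that in the log Tigmint calls long-to-linked-pe through a path relative to its own location (../src/), so that path is checked directly as well as PATH.

```python
import os
import shutil

# Executables that appear in the logs in this thread (possibly incomplete).
REQUIRED_TOOLS = [
    "goldrush-path", "goldrush-edit", "minimap2", "samtools",
    "tigmint-make", "tigmint-cut", "ntLink_rounds", "ntLink", "indexlr",
]

# Path copied from the "not found" message in the log above.
LONG_TO_LINKED_PE = "/usr/local/bin/../src/long-to-linked-pe"


def missing_tools(tools, path=None):
    """Return the tools that cannot be found as executables on PATH."""
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


def is_executable_file(path):
    """True when path is an existing regular file with execute permission."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    for tool in missing_tools(REQUIRED_TOOLS):
        print(f"not on PATH: {tool}")
    if not is_executable_file(LONG_TO_LINKED_PE):
        print(f"missing or not executable: {LONG_TO_LINKED_PE}")
```

An empty report does not prove the installation is complete (Python modules and library versions are not checked), but a non-empty one points at the first thing to fix.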

For conda, are you installing GoldRush in a fresh conda environment? I highly recommend doing that when you see issues with solving the environment. If it is a fresh environment, you can try mamba (conda install -c conda-forge mamba, then mamba install -c conda-forge -c bioconda goldrush), which installs conda packages more quickly.

gringer commented 1 year ago

Installing via conda doesn't work, unfortunately:

root@musculus:~# mamba install -c conda-forge -c bioconda goldrush

Looking for: ['goldrush']

bioconda/linux-64 (check zst)                       Checked  0.4s
bioconda/noarch (check zst)                         Checked  0.3s
pkgs/main/linux-64 (check zst)                      Checked  0.4s
pkgs/main/noarch (check zst)                        Checked  0.0s
pkgs/r/linux-64 (check zst)                         Checked  0.4s
pkgs/r/noarch (check zst)                           Checked  0.3s
pkgs/main/linux-64                                   5.3MB @   6.5MB/s  0.9s
conda-forge/noarch                                  12.3MB @  12.9MB/s  1.2s
pkgs/main/noarch                                   696.9kB @ 571.6kB/s  0.3s
pkgs/r/linux-64                                      1.2MB @ 881.4kB/s  0.4s
bioconda/noarch                                      4.7MB @   3.1MB/s  1.5s
pkgs/r/noarch                                        1.3MB @ 830.6kB/s  0.3s
bioconda/linux-64                                    5.1MB @   3.3MB/s  1.7s
conda-forge/linux-64                                30.1MB @  10.2MB/s  3.6s

Pinned packages:
  - python 3.10.*

warning  libmamba Added empty dependency for problem type SOLVER_RULE_UPDATE
Could not solve for environment specs
The following packages are incompatible
└─ goldrush is installable with the potential options
   ├─ goldrush [0.9.3|1.0.0|1.0.1|1.0.2|1.0.3] would require
   │  └─ python >=3.8,<3.9.0a0 , which can be installed;
   └─ goldrush [0.9.3|1.0.0|1.0.1|1.0.2|1.0.3] would require
      └─ python >=3.9,<3.10.0a0 , which can be installed.

lcoombe commented 1 year ago

Hi @gringer, you are in an environment that already has mamba and Python installed. Because a couple of GoldRush dependencies require an earlier Python version, try downgrading Python prior to the installation. For example, if you are installing mamba in a fresh env:

conda install -c conda-forge mamba python=3.9

then

mamba install -c conda-forge -c bioconda goldrush
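
The solver output earlier in the thread lists the Python range that the bioconda goldrush builds 0.9.3 to 1.0.3 accept (>=3.8 and <3.10), which is why pinning python=3.9 is suggested. Below is a small sketch for checking an interpreter against that range before attempting the install. The bounds are copied from that output and may not hold for later releases, and python_supported is an illustrative name.

```python
import sys

# Bounds taken from the solver messages in this thread: >=3.8 and <3.10.
MIN_PYTHON = (3, 8)
MAX_PYTHON_EXCLUSIVE = (3, 10)


def python_supported(version=None):
    """True when the (major, minor) version falls inside the accepted range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_PYTHON <= major_minor < MAX_PYTHON_EXCLUSIVE


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_supported():
        print(f"Python {current} is inside the range seen in the solver output.")
    else:
        print(f"Python {current} is outside that range; try an env with python=3.9.")
```

Run inside the target conda environment, this gives a quick answer on whether the pinned Python is what blocks the solve, as the "Pinned packages: python 3.10.*" line in the mamba output indicates here.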

github-actions[bot] commented 11 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your interest in GoldRush!