kfuku52 / csubst

Molecular convergence detection
BSD 3-Clause "New" or "Revised" License

struct.error: 'i' format requires -2147483648 <= number <= 2147483647 #13

Closed by kfuku52 2 years ago

kfuku52 commented 3 years ago

This error occurred with a large dataset ("mitogenome"). Running with --float_type 32 instead of --float_type 64 worked around it.

csubst analyze --max_arity 2 --threads 8 --alignment_file alignment.fa --rooted_tree_file tree.nwk --foreground foreground.txt --force_exhaustive yes --ml_anc no --fg_stem_only yes --omega_method modelfree --asrv each --iqtree_model GY+F+R4 --genetic_code 2 --iqtree_redo no --float_type 64
CSUBST start: 2021-03-16 09:03:43.772250+00:00
csubst analyze start: 2021-03-16 09:03:43.776029+00:00
Reading and parsing input files.
Using internal node names and branch lengths in --iqtree_treefile and the root position in --rooted_tree_file.
Total branch length of --rooted_tree_file: 386.6432154395002
Total branch length of --iqtree_treefile: 398.2955882267

IQ-TREE's intermediate files exist.
Reading the state file: alignment.fa.state

Writing alignment: csubst_alignment_codon.fa
Writing alignment: csubst_alignment_aa.fa
Ancestral states were not estimated on the root node. Excluding sub-root nodes from the analysis.
Memory map is generated. dtype=float64, axis=(147, 3758, 1, 20, 20), path=/Users/kef74yk/Dropbox (Personal)/repos/csubst/data/mitogenome/tmp.csubst.sub_tensor.N.mmap
Memory map is generated. dtype=float64, axis=(147, 3758, 20, 6, 6), path=/Users/kef74yk/Dropbox (Personal)/repos/csubst/data/mitogenome/tmp.csubst.sub_tensor.S.mmap
Branch lengths of the IQ-TREE output are rescaled to match observed-codon-substitutions/codon-site, rather than nucleotide-substitutions/codon-site.
Total branch length before rescaling: 398.296 nucleotide substitutions / codon site
Total S+N branch length after rescaling: 64.164 codon substitutions / codon site
Total S branch length after rescaling: 52.031 codon substitutions / codon site
Total N branch length after rescaling: 12.133 codon substitutions / codon site
Synonymous substitutions / tree = 195,239.3
Nonsynonymous substitutions / tree = 45,500.6
Synonymous substitutions / branch = 1,337.3
Nonsynonymous substitutions / branch = 311.6
Synonymous substitutions / site = 52.0
Nonsynonymous substitutions / site = 12.1
Elapsed time: 27.0 sec

Generating site table.
Memory consumption of s table: 0.1 Mbytes (dtype=float64)
Elapsed time: 10.0 sec

Generating branch table.
Number of S_sub patterns among 147 branches=145, min=0.0, max=1,650.5355610055
Number of N_sub patterns among 147 branches=145, min=0.0, max=984.1845200000001
Memory consumption of b table: 0.0 Mbytes (dtype=object)
Elapsed time: 0.0 sec

Generating combinat-branch table. Arity = 2
Exhaustively searching independent branch combinations.
Arity: 2
All nodes: 146
all target nodes: 146
all node combinations: 10,585
removing 1,055 dependent branch combinations.
detected 74 (out of 9,530) foreground branch combinations to be treated as non-foreground.
independent node combinations: 9,530
Preparing the cbOS table with 8 thread(s).
Traceback (most recent call last):
  File "/Users/kef74yk/Dropbox_p/repos/csubst/csubst/csubst", line 275, in <module>
    args.handler(args)
  File "/Users/kef74yk/Dropbox_p/repos/csubst/csubst/csubst", line 47, in command_analyze
    main_analyze(g)
  File "/Users/kef74yk/Dropbox (Personal)/repos/csubst/csubst/main_analyze.py", line 262, in main_analyze
    g = cb_search(g, b, S_tensor, N_tensor, id_combinations, mode='foreground', write_cb=True)
  File "/Users/kef74yk/Dropbox (Personal)/repos/csubst/csubst/main_analyze.py", line 99, in cb_search
    cbS = substitution.get_cb(id_combinations, S_tensor, g, 'S')
  File "/Users/kef74yk/Dropbox (Personal)/repos/csubst/csubst/substitution.py", line 172, in get_cb
    (ids, sub_tensor, True, df_mmap, ms, g['float_type']) for ids,ms in zip(id_chunks, mmap_starts)
  File "/Users/kef74yk/anaconda3/lib/python3.7/site-packages/joblib/parallel.py", line 1061, in __call__
    self.retrieve()
  File "/Users/kef74yk/anaconda3/lib/python3.7/site-packages/joblib/parallel.py", line 940, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/Users/kef74yk/anaconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
  File "/Users/kef74yk/anaconda3/lib/python3.7/multiprocessing/pool.py", line 431, in _handle_tasks
    put(task)
  File "/Users/kef74yk/anaconda3/lib/python3.7/site-packages/joblib/pool.py", line 157, in send
    self._writer.send_bytes(buffer.getvalue())
  File "/Users/kef74yk/anaconda3/lib/python3.7/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/Users/kef74yk/anaconda3/lib/python3.7/multiprocessing/connection.py", line 393, in _send_bytes
    header = struct.pack("!i", n)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
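
The last frame of the traceback can be reproduced in isolation. The sketch below uses only the standard-library struct module; the helper name pack_length_header is hypothetical and simply mirrors the struct.pack("!i", n) call shown in multiprocessing/connection.py above. It shows that a message length of 2**31 bytes or more cannot be packed into the signed 32-bit header.

```python
import struct

INT32_MAX = 2**31 - 1  # 2147483647, the upper bound named in the error message


def pack_length_header(n):
    """Hypothetical helper: pack a message length as a big-endian signed 32-bit int,
    as the struct.pack("!i", n) call in the traceback does."""
    return struct.pack("!i", n)


# The largest length that still fits yields a 4-byte header.
header = pack_length_header(INT32_MAX)

# One byte more no longer fits and raises struct.error.
try:
    pack_length_header(INT32_MAX + 1)
    overflowed = False
except struct.error as exc:
    overflowed = True
    print("struct.error:", exc)
```

So any single message of 2 GiB or more sent to a worker through this code path in Python 3.7 would fail in this way, regardless of what the message contains.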

kfuku52 commented 3 years ago

Reproduced with the prestin dataset. Running with --float_type 32 worked there as well.
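
A possible explanation for why --float_type 32 helps, as a rough sketch rather than a confirmed diagnosis: the arithmetic below takes the two tensor shapes printed in the mitogenome log ("Memory map is generated" lines) and compares their raw sizes with the signed 32-bit limit. It assumes that a whole tensor ends up serialized in a single message to a worker, which the log does not confirm. The names SHAPES, ITEMSIZE and tensor_nbytes are made up for this illustration.

```python
INT32_MAX = 2**31 - 1  # limit of the "!i" length header in the traceback

# Shapes copied from the "Memory map is generated" lines of the mitogenome log.
SHAPES = {
    "N": (147, 3758, 1, 20, 20),
    "S": (147, 3758, 20, 6, 6),
}

# Bytes per element for the two --float_type settings.
ITEMSIZE = {"float64": 8, "float32": 4}


def tensor_nbytes(shape, dtype):
    """Raw size in bytes of a dense array with the given shape and dtype."""
    n_elements = 1
    for dim in shape:
        n_elements *= dim
    return n_elements * ITEMSIZE[dtype]


for name, shape in SHAPES.items():
    for dtype in ("float64", "float32"):
        nbytes = tensor_nbytes(shape, dtype)
        status = "exceeds" if nbytes > INT32_MAX else "fits within"
        print(f"{name} tensor, {dtype}: {nbytes:,} bytes ({status} the int32 limit)")
```

Under that assumption, only the S tensor in float64 (3,181,973,760 bytes) is above 2,147,483,647; in float32 it is 1,590,986,880 bytes, which fits. This is consistent with the failure occurring in get_cb for the S tensor and with --float_type 32 avoiding it.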