matsengrp / sumrep

Summary statistics for repertoires

Error with partis #33

Closed: Pezhvuk closed this issue 5 years ago

Pezhvuk commented 5 years ago

Hi Branden,

I have now completed a full run of Sumrep with the Partis backend on over 50 files from the DDW and Levin studies with no problems, and I don't see why the remaining files from those studies would be any different. However, every file I have tried from the Gupta et al. (2017) study returns the following error:

Traceback (most recent call last):
  File "/d/as7/s/partis/bin/partis", line 450, in <module>
    args.func(args)
  File "/d/as7/s/partis/bin/partis", line 214, in run_partitiondriver
    parter.run(actions)
  File "/d/as7/s/partis/python/partitiondriver.py", line 108, in run
    self.action_fcns[tmpaction]()
  File "/d/as7/s/partis/python/partitiondriver.py", line 272, in cache_parameters
    _, annotations, hmm_failures = self.run_hmm('viterbi', parameter_in_dir=self.sw_param_dir, parameter_out_dir=self.hmm_param_dir, count_parameters=True)
  File "/d/as7/s/partis/python/partitiondriver.py", line 1058, in run_hmm
    self.execute(cmd_str, n_procs)
  File "/d/as7/s/partis/python/partitiondriver.py", line 1029, in execute
    utils.run_cmds(cmdfos, batch_system=self.args.batch_system, batch_options=self.args.batch_options, batch_config_fname=self.args.batch_config_fname, debug='print' if self.args.debug else None)
  File "/d/as7/s/partis/python/utils.py", line 2581, in run_cmds
    finish_process(iproc, procs, n_tries, cmdfos[iproc], n_max_tries, dbgfo=cmdfos[iproc]['dbgfo'], batch_system=batch_system, batch_options=batch_options, debug=debug, ignore_stderr=ignore_stderr, clean_on_success=clean_on_success)
  File "/d/as7/s/partis/python/utils.py", line 2674, in finish_process
    raise Exception(failstr)
Exception: exceeded max number of tries for cmd
    /d/as7/s/partis/packages/ham/bcrham --algorithm viterbi --hmmdir /d/as2/u/mp002/sumrep_project/Gupta/partis/S-GMC_-1h/params/sw/hmms --datadir /tmp/mp002/hmms/303421/germline-sets --infile /tmp/mp002/hmms/303421/hmm-0/hmm_input.csv --outfile /tmp/mp002/hmms/303421/hmm-0/hmm_output.csv --locus igh --random-seed 1554425988 --only-cache-new-vals --ambig-base N
look for output in /tmp/mp002/hmms/303421/hmm-0 and /tmp/mp002/hmms/303421/hmm-0

FYI, I updated Partis last week, and Sumrep is also up to date. However, I ran Sumrep with Partis on all the files in the Gupta study with no problems last year (around March). Has something broken in the Partis/Sumrep updates since then?

I have posted this issue on the Partis page as well.

S-GMC_-1h.txt S-FV_-1h.txt

Cheers, Pejvak.

BrandenOlson commented 5 years ago

Hey Peji,

This looks like a partis error, specifically an error in the bcrham subprocess of the partis call. Hopefully Duncan can help you figure out what is going on here.
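
One way to narrow this down is to rerun the failing subprocess by itself and read its stderr, since the "exceeded max number of tries" message only says that the command kept failing. Below is a minimal sketch, not part of partis or sumrep: `rerun` and `report` are hypothetical helper names, and it assumes Python 3.7+ and that the command string is copied verbatim from the exception message.

```python
import shlex
import subprocess
import sys


def rerun(cmd, timeout=600):
    """Run a command once and return (returncode, stdout, stderr).

    `cmd` may be a list of arguments or a single shell-style string,
    e.g. the bcrham command line copied from the exception message.
    """
    args = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    proc = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    return proc.returncode, proc.stdout, proc.stderr


def report(cmd):
    """Print the exit code and the last lines of stderr for a quick look."""
    returncode, _, err = rerun(cmd)
    print("exit code:", returncode, file=sys.stderr)
    for line in err.splitlines()[-20:]:
        print("stderr:", line, file=sys.stderr)
    return returncode


# Hypothetical usage:
# failing_cmd = "<the bcrham command line copied from the traceback>"
# report(failing_cmd)
```

A nonzero exit code together with a message about a missing library or file would point at the installation rather than at the input data. The directory named in the "look for output in" line of the error is also worth checking.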

My first guess would be that the Gupta sequences do not yield a high enough likelihood to train partis' model sufficiently. This can happen for a variety of reasons, for example when sequences are really short, or when the sequences don't match the given locus. But that's probably as much help as I can be with this one -- best wishes and let me know if I can help further.
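
If you want to rule out the "really short sequences" possibility quickly, something like the following could help. It is only a sketch under stated assumptions: it assumes the input is FASTA (the attached files may be in another format), the function names are made up for illustration, and the `min_length=200` cutoff is an arbitrary placeholder rather than a threshold partis uses.

```python
import statistics


def read_fasta_lengths(lines):
    """Return the sequence length of each record in FASTA-formatted lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def summarize_lengths(lengths, min_length=200):
    """Summarize lengths; min_length is an arbitrary, assumed cutoff."""
    if not lengths:
        return None
    return {
        "n_seqs": len(lengths),
        "min": min(lengths),
        "median": statistics.median(lengths),
        "max": max(lengths),
        "n_short": sum(1 for n in lengths if n < min_length),
    }


# Hypothetical usage (the file name is a placeholder):
# with open("my_sequences.fasta") as handle:
#     print(summarize_lengths(read_fasta_lengths(handle)))
```

Comparing this summary between a file that runs cleanly and one that fails would show whether sequence length is a plausible culprit.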

Pezhvuk commented 5 years ago

Hey Branden,

I posted about the issue on the Partis page; it turned out to be a missing-package problem that did not cause an error during installation. Why that problem was specific to the Gupta et al. sequences, I haven't the foggiest!

BrandenOlson commented 5 years ago

Very strange, but glad you got it working!