PoonLab / vindels

Developing an empirical model of sequence insertion and deletion in virus genomes

GeoPIP works on test files, but not my own #62

Closed jpalmer37 closed 5 years ago

jpalmer37 commented 5 years ago

I got GeoPIP running using the changes in your geoPIP issue. I was able to complete geopip_test.py and ran data_analysis.py on their test data, molluscan.fasta, successfully with no errors.

However, when I change the input file of data_analysis.py to my own FASTA file, I get one of the exact same error messages from your geoPIP thread:

### updating rate matrix Q ###

Exception in thread "main" java.lang.RuntimeException: ma.newick.ParseException: Encountered " "(" "( "" at line 1, column 2.
Was expecting:
    ")" ...

    at pty.RootedTree$Util.load(RootedTree.java:240)

skipping some lines...

Traceback (most recent call last):
  File "data_analysis.py", line 110, in <module>
    qMatCTMC1, bDictCTMC1, treeCTMC1, nllkCTMC1 = opt_ctmc_full(qMatStart, multiAlign, javaDirectory, modelDirectory, eStepFile, parametersPath, inputLoc, outputLoc, dataLoc, execsLoc, rFileLoc, cList, qRates=[1.], suffix='_ctmc_1rate', updateQ=updateQ, tol=1.e-2, bTol=1.e-4, iterMax=100)
  File "/home/jpalmer/geopip/src/ctmc_est.py", line 95, in opt_ctmc_full
    qMatNew, piProbNew = opt_qmat_em_full(qMat, cList, inputLoc, outputLoc, javaDirectory, modelDirectory, eStepFile, parametersPath, execsLoc)
  File "/home/jpalmer/geopip/src/pip_est.py", line 213, in opt_qmat_em_full
    llh = llh_from_llhfiles(llhFiles)
  File "/home/jpalmer/geopip/src/io_json.py", line 454, in llh_from_llhfiles
    fileTem = open(llhFile)
IOError: [Errno 2] No such file or directory: '/home/jpalmer/geopip/result/data_analysis/runs/7/output/all/llh.txt'

I've ensured that I'm running JDK8 and have the libgfortran3 package installed.

One difference I've noticed is that their FASTA file molluscan.fasta is formatted differently:

Not sure if that plays any role.
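
One way to test the formatting hunch is to check whether any sequence names in my FASTA file contain characters that have structural meaning in Newick (parentheses, brackets, commas, colons, semicolons, quotes, whitespace). This is only a guess at the cause: I don't know that GeoPIP copies FASTA headers into the tree string it passes to the Java parser. The helpers below (`find_risky_headers`, `sanitize_header`) are hypothetical names of my own, not part of GeoPIP.

```python
import re

# Runs of characters that are structural in Newick strings, plus whitespace.
# Assumption: the sequence names end up as tip labels in the tree string.
NEWICK_RESERVED = re.compile(r"[()\[\],:;'\s]+")


def find_risky_headers(fasta_lines):
    """Return (line_number, name) pairs for headers containing reserved characters."""
    risky = []
    for number, line in enumerate(fasta_lines, start=1):
        if line.startswith(">"):
            name = line[1:].rstrip("\r\n")
            if NEWICK_RESERVED.search(name):
                risky.append((number, name))
    return risky


def sanitize_header(name):
    """Replace each run of reserved characters with a single underscore."""
    return NEWICK_RESERVED.sub("_", name)
```

Passing an open file handle (or a list of lines) to `find_risky_headers` lists the headers worth a closer look. If any turn up, rewriting them with `sanitize_header` and rerunning data_analysis.py would show whether the names are related to the ParseException.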