PacificBiosciences / FALCON_unzip

Making diploid assembly becomes common practice for genomic study
BSD 3-Clause Clear License

running test data error on first command #107

Closed rob123king closed 6 years ago

rob123king commented 6 years ago

I have installed falcon-unzip as below using the pre-compiled binaries, and the install seems fine:

cd /home/data/bioinf_resources/programming_tools/
#use virtualenv-2.7
unset PYTHONPATH
source falcontest/bin/activate
tar xvzf falcon-2017.11.02-16.04-py2.7-ucs4.tar.gz -C falcontest
export LD_LIBRARY_PATH=falcontest/lib:${LD_LIBRARY_PATH}
export PYTHONPATH=/home/data/bioinf_resources/programming_tools/falcontest/lib/python2.7
pip install pandas
easy_install --upgrade numpy

# set PATH for MUMmer's nucmer and show-coords
export PATH=/home/data/bioinf_resources/programming_tools/mummer-3.9.4alpha:$PATH
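A quick way to confirm that the setup above actually put the needed executables on PATH is sketched below. `check_tools` is a hypothetical helper name, not part of FALCON or MUMmer; it relies only on the shell built-in `command -v`.

```shell
# Hypothetical helper (not part of FALCON): report whether each named
# executable can be resolved through the current PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf 'found    %s -> %s\n' "$tool" "$(command -v "$tool")"
        else
            printf 'MISSING  %s\n' "$tool"
            missing=1
        fi
    done
    # Exit status is 0 only if every tool was found.
    return "$missing"
}

# Example call, using the tool names mentioned in this thread:
#   check_tools nucmer show-coords fc_track_reads_htigs0.py
```

Any line starting with MISSING points at a PATH or activation problem to fix before running the pipeline.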

How do I run the example data? Which commands and config files should I use? I can download the raw data from ENA using the project codes below.

Arabidopsis data: PRJNA314706
V. vinifera cv. Cabernet Sauvignon: PRJNA316730

I've downloaded the config files from below for a test run:

I changed the smrt_bin line in fc_unzip.cfg to the following:

smrt_bin=/home/data/bioinf_resources/programming_tools/falcontest/bin/

I've downloaded the example assemblies file which comes with the config files used:

I assume I download the raw data and put the paths in “input.fofn”, but how do I start the run after that?

I have downloaded the raw data for the Arabidopsis assembly as a first test. I have updated the input.fofn file with the file locations, and set the smrtanalysis/bin location in fc_unzip.cfg.

The first command in the unzip.sh file has changed in the pre-built binaries. This is the file used in the paper to run the pipeline from start to finish, but I get an error on the very first command. "fc_track_reads.py" has become "fc_track_reads_htigs0.py". I made that change, but when I run the first command I get the error below:

fc_track_reads_htigs0.py
No handlers could be found for logger "pypeflow.simple_pwatcher_bridge"
Traceback (most recent call last):
  File "/home/data/bioinf_resources/programming_tools/falcontest/bin/fc_track_reads_htigs0.py", line 11, in <module>
    load_entry_point('falcon-unzip==0.4.0', 'console_scripts', 'fc_track_reads_htigs0.py')()
  File "/scratch/cdunn/fork/.git/LOCAL4/lib/python2.7/site-packages/falcon_unzip/mains/track_reads_htigs0.py", line 338, in main
  File "/home/data/bioinf_resources/programming_tools/falcontest/lib/python2.7/site-packages/pypeflow/simple_pwatcher_bridge.py", line 273, in refreshTargets
    self._refreshTargets(updateFreq, exitOnFailure)
  File "/home/data/bioinf_resources/programming_tools/falcontest/lib/python2.7/site-packages/pypeflow/simple_pwatcher_bridge.py", line 339, in _refreshTargets
    raise Exception(msg)
Exception: Some tasks are recently_done but not satisfied: set([Node(0-rawreads), Node(1-preads_ovl)])
pb-cdunn commented 6 years ago

export PYTHONPATH=/home/data/bioinf_resources/programming_tools/falcontest/lib/python2.7

That PYTHONPATH is wrong, and it shouldn't be needed.

Anyway, you have to look into a task directory to learn why that task failed.

Exception: Some tasks are recently_done but not satisfied: set([Node(0-rawreads)

That means you need to look for a stderr file somewhere under 0-rawreads/. Usually, the problem is an immediately obvious integration problem.
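The search described above could be sketched as follows. The assumptions here are that the log files are literally named `stderr` and that the command is run from the directory containing 0-rawreads/; `show_task_stderr` is a hypothetical helper name, not a FALCON command.

```shell
# Hypothetical helper (not part of FALCON): print the tail of every
# non-empty file named "stderr" under a task directory.
# Assumption: task logs are written to files literally named "stderr".
show_task_stderr() {
    task_dir="${1:-0-rawreads}"
    # -size +0 skips empty files; sort gives a stable order.
    find "$task_dir" -type f -name 'stderr' -size +0 | sort |
    while IFS= read -r f; do
        printf '==> %s <==\n' "$f"
        tail -n 20 "$f"
    done
}

# Example calls, using the two nodes named in the exception:
#   show_task_stderr 0-rawreads
#   show_task_stderr 1-preads_ovl
```

The last lines of each file are usually enough to spot a missing executable or a bad path.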