kellrott / SMC-Het-Challenge-Eval


PyClone #8

Open galder-max opened 5 years ago

galder-max commented 5 years ago

Ke will update the docker for PyClone.

galder-max commented 5 years ago

Just sent another reminder.

galder-max commented 5 years ago

@kellrott "The updated PyClone docker can be found here:

Updated: https://github.com/keyuan/docker-pyclone/tree/smc-het-dev
Original: https://github.com/keyuan/docker-pyclone/tree/smc-het-old

There is only one version, “pyclone.xml”, which uses Battenberg purity.

The major differences are the following:

  1. There is a filter for false positive (CCF = 0) variants.
  2. Minor improvements in post-processing steps for subclonal events."
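
For illustration, a minimal R sketch of the CCF = 0 filter mentioned in point 1, assuming the variant table has a ccf column; the file and column names are placeholders, not the pipeline's actual ones.

# Hypothetical illustration of the CCF = 0 filter: drop variants whose
# estimated cancer cell fraction is zero (treated as likely false positives).
variants <- read.delim("assigned_variants.tsv", stringsAsFactors = FALSE)
filtered <- variants[variants$ccf > 0, ]
write.table(filtered, "assigned_variants_filtered.tsv",
            sep = "\t", quote = FALSE, row.names = FALSE)
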
kellrott commented 5 years ago

Issues building the docker image:

Step 5/9 : RUN pip install pyyaml \
    && cd /home/pipeline/ \
    && tar xvfz /home/pipeline/PyDP-0.2.2.tar.gz \
    && cd /home/pipeline/PyDP-0.2.2/ \
    && python setup.py install \
    && cd /home/pipeline/ \
    && tar xvfz /home/pipeline/pyclone_ke.tar.gz \
    && cd /home/pipeline/pyclone/ \
    && python setup.py install \
    && cd /home/

I get the error:

 ---> Running in afb93d202bfd
Collecting pyyaml
Exception:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/pip/basecommand.py", line 209, in main
    status = self.run(options, args)
  File "/usr/lib/python2.7/dist-packages/pip/commands/install.py", line 328, in run
    wb.build(autobuilding=True)
  File "/usr/lib/python2.7/dist-packages/pip/wheel.py", line 748, in build
    self.requirement_set.prepare_files(self.finder)
  File "/usr/lib/python2.7/dist-packages/pip/req/req_set.py", line 360, in prepare_files
    ignore_dependencies=self.ignore_dependencies))
  File "/usr/lib/python2.7/dist-packages/pip/req/req_set.py", line 512, in _prepare_file
    finder, self.upgrade, require_hashes)
  File "/usr/lib/python2.7/dist-packages/pip/req/req_install.py", line 273, in populate_link
    self.link = finder.find_requirement(self, upgrade)
  File "/usr/lib/python2.7/dist-packages/pip/index.py", line 442, in find_requirement
    all_candidates = self.find_all_candidates(req.name)
  File "/usr/lib/python2.7/dist-packages/pip/index.py", line 400, in find_all_candidates
    for page in self._get_pages(url_locations, project_name):
  File "/usr/lib/python2.7/dist-packages/pip/index.py", line 545, in _get_pages
    page = self._get_page(location)
  File "/usr/lib/python2.7/dist-packages/pip/index.py", line 648, in _get_page
    return HTMLPage.get_page(link, session=self.session)
  File "/usr/lib/python2.7/dist-packages/pip/index.py", line 757, in get_page
    "Cache-Control": "max-age=600",
  File "/usr/share/python-wheels/requests-2.9.1-py2.py3-none-any.whl/requests/sessions.py", line 480, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/pip/download.py", line 378, in request
    return super(PipSession, self).request(method, url, *args, **kwargs)
  File "/usr/share/python-wheels/requests-2.9.1-py2.py3-none-any.whl/requests/sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/share/python-wheels/requests-2.9.1-py2.py3-none-any.whl/requests/sessions.py", line 576, in send
    r = adapter.send(request, **kwargs)
  File "/usr/share/python-wheels/CacheControl-0.11.5-py2.py3-none-any.whl/cachecontrol/adapter.py", line 46, in send
    resp = super(CacheControlAdapter, self).send(request, **kw)
  File "/usr/share/python-wheels/requests-2.9.1-py2.py3-none-any.whl/requests/adapters.py", line 376, in send
    timeout=timeout
  File "/usr/share/python-wheels/urllib3-1.13.1-py2.py3-none-any.whl/urllib3/connectionpool.py", line 610, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/share/python-wheels/urllib3-1.13.1-py2.py3-none-any.whl/urllib3/util/retry.py", line 228, in increment
    total -= 1
TypeError: unsupported operand type(s) for -=: 'Retry' and 'int'
You are using pip version 8.1.1, however version 19.1.1 is available.
kellrott commented 5 years ago

Adding the line

RUN apt-get update && apt-get upgrade -y python-pip

to the Dockerfile fixes the issue.

kellrott commented 5 years ago

Inside /home/pipeline/run_analysis_pyclone.R there is the line

vcfParserPath <- dir(path = getwd(), pattern = "create_ccfclust_inputs.py", full.names = T)

This doesn't work if the code is executed from a different working directory.
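
A sketch of one way to make this lookup independent of the working directory; getScriptDir is a hypothetical helper, and the /home/pipeline fallback is only an assumption about where the Dockerfile installs the pipeline:

# Resolve create_ccfclust_inputs.py relative to this R script rather than
# the current working directory.
getScriptDir <- function() {
  args <- commandArgs(trailingOnly = FALSE)
  fileArg <- grep("^--file=", args, value = TRUE)
  if (length(fileArg) > 0) {
    return(dirname(normalizePath(sub("^--file=", "", fileArg[1]))))
  }
  "/home/pipeline"  # assumed install location; fallback for interactive use
}

vcfParserPath <- file.path(getScriptDir(), "create_ccfclust_inputs.py")
stopifnot(file.exists(vcfParserPath))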

keyuan commented 5 years ago

I've just pushed a fix for this: https://github.com/keyuan/docker-pyclone/commit/787d5f03dbe350db71fa18343a7f62f554c5f7be

kamichiotti commented 4 years ago

I am having problems running docker-pyclone (the smc-het-dev branch) on the cluster, although I have been able to get it to work when I build keyuan/ccube and run it locally. The error I keep getting refers to an inability to find the ParseSnvCnaBattenberg function; when working interactively inside a docker-pyclone container, I find the function is not available. I have tracked the issue down to the Dockerfile building off the keyuan/docker-ccube base (which in turn builds off keyuan/ccube; keyuan/ccube does contain all of the functions required by docker-pyclone).
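
(For reference, a quick check of this sort from an R session inside the container shows whether the image's ccube build ships the function; whether ParseSnvCnaBattenberg is meant to be exported or only internal is an assumption to verify.)

# Inside an R session in the docker-pyclone container:
library(ccube)
packageVersion("ccube")
"ParseSnvCnaBattenberg" %in% getNamespaceExports("ccube")  # exported?
"ParseSnvCnaBattenberg" %in% ls(getNamespace("ccube"))     # present at all?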

I originally chalked this up to the docker-pyclone Dockerfile pulling an old version of docker-ccube that doesn't have all of the functions now available in ccube, so I rebuilt the docker-ccube image on the latest version of ccube. That did solve the issue of docker-pyclone finding all of the required functions, but now I am getting additional errors, primarily centered on the availability of devtools for R.

I have made multiple attempts at getting devtools to install correctly, but whenever it does install, at least one of several other packages is still missing because of compatibility problems with R version 3.4. At this point, I think the Dockerfile needs to be rewritten, possibly upgrading R to work with current versions of devtools, mcclust, etc.
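
One possible stopgap while the Dockerfile is reworked would be to pin the last devtools 1.x release instead of upgrading R; this is only a sketch, and the specific versions (devtools 1.13.6 via remotes::install_version) and their compatibility with R 3.4 are assumptions I have not verified against this image.

# Sketch: pin package versions that may still build on R 3.4, as an
# alternative to upgrading R. Versions here are untested assumptions.
install.packages("remotes", repos = "https://cloud.r-project.org")
remotes::install_version("devtools", version = "1.13.6",
                         repos = "https://cloud.r-project.org")
install.packages("mcclust", repos = "https://cloud.r-project.org")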

kamichiotti commented 4 years ago

The exact error I'm getting now when I run keyuan/docker-pyclone via cwltool is:

Warning message:
In left_join_impl(x, y, by$x, by$y, suffix$x, suffix$y) :
  joining factor and character vector, coercing into character vector
Warning message:
In left_join_impl(x, y, by$x, by$y, suffix$x, suffix$y) :
  joining factor and character vector, coercing into character vector
Error in comp.psm(ltmat) : All elements of cls must be integers in 1:nobs
Execution halted

However, of the 74 files the algorithm is being evaluated against, only 26 fail with this error; the rest complete successfully. One clue to where the problem resides: the 1B.txt and 1C.txt outputs do appear to be written, and the next errors I get pertain to a missing 2A.txt, so the failure is most likely happening during creation of the 2A.txt file.

Strangely, when I had mcmc and burnin set at 10 and 2, respectively, for testing purposes, I ran the entire set of samples and only one failed.
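
For reference, a minimal sketch of the relabelling that satisfies the check inside mcclust::comp.psm(): every entry of the label matrix must be an integer in 1:nobs, so 0-based, non-consecutive, or NA labels trigger exactly this error. Whether the pipeline's ltmat actually contains such labels for the failing tumours is an assumption to verify; the matrix below is a toy example.

library(mcclust)

# Map each row of cluster labels onto consecutive integers 1, 2, ...
# so that comp.psm() accepts them; the partition itself is unchanged.
relabelForPsm <- function(ltmat) {
  t(apply(ltmat, 1, function(z) as.integer(factor(z))))
}

ltmat <- rbind(c(0, 0, 1, 2),   # 0-based labels fail comp.psm() as-is
               c(1, 1, 2, 3))
psm <- comp.psm(relabelForPsm(ltmat))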

keyuan commented 4 years ago

Is it possible to share one of the problematic cases to check the bug?

kamichiotti commented 4 years ago

Absolutely. I'll email you the data for one of the tumors along with some of the outputs corresponding to that tumor.

Just to make sure we are on the same page, please be sure to read the comment I left under the "SVclone and CCube" issue. Due to the changes documented there, it was necessary to rebuild the pyclone image off of the updated ccube base image (smcheteval/ccube on DockerHub; builds from smc-het-challenge/docker-ccube on GitHub). The updated pyclone image is smcheteval/pyclone on DockerHub, which builds from smc-het-challenge/docker-pyclone on GitHub.

With this setup, I was able to get 56 of the tumors to run successfully (40 of which were also successful with ccube), but there are 18 that continue to give me this error:

Error in comp.psm(ltmat) : All elements of cls must be integers in 1:nobs
Execution halted
INFO [job pyclone.cwl] Max memory used: 0MiB
WARNING [job pyclone.cwl] completed permanentFail
WARNING Final process status is permanentFail

Also, is there an optimal number of threads I should be running with? I'm arbitrarily using 6 now.