sigven / pcgr

Personal Cancer Genome Reporter (PCGR)
https://sigven.github.io/pcgr
MIT License

Issues installing pcgrr in PCGR v2.0.0 #244

cornejoem closed this issue 4 months ago

cornejoem commented 4 months ago

Hello,

Following the installation instructions for PCGR 2.0.0 via conda, I am encountering an issue when installing pcgrr. These are the exact commands I am using:

#install PCGR
PCGR_VERSION="2.0.0"
#set up variables
PCGR_REPO="https://raw.githubusercontent.com/sigven/pcgr/v${PCGR_VERSION}/conda/env/lock/"
PLATFORM="osx"
#create conda envs in local directory
CONDA_SUBDIR=osx-64 conda create --prefix Software/pcgr_conda/pcgr --file ${PCGR_REPO}/pcgr-${PLATFORM}-64.lock
CONDA_SUBDIR=osx-64 conda create --prefix Software/pcgr_conda/pcgrr --file ${PCGR_REPO}/pcgrr-${PLATFORM}-64.lock
##despite errors while installing package, PCGR appears to have been installed correctly.

Yet I receive the errors and exceptions listed below. I can run PCGR, but the R step generates no HTML output. I would greatly appreciate any help.

Downloading and Extracting Packages:

Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
ERROR conda.core.link:_execute(945): An error occurred while installing package 'bioconda::bioconductor-genomeinfodbdata-1.2.11-r43hdfd78af_1'.
Rolling back transaction: done
class: LinkError
message:
post-link script failed for package bioconda::bioconductor-genomeinfodbdata-1.2.11-r43hdfd78af_1
location of failed script: /.../.../Software/pcgr_conda/pcgrr/bin/.bioconductor-genomeinfodbdata-post-link.sh
==> script messages <==
<None>
==> script output <==
stdout: 
stderr: ++ dirname -- /.../.../Software/pcgr_conda/pcgrr/bin/installBiocDataPackage.sh
+ SCRIPT_DIR= /.../.../Software/pcgr_conda/pcgrr/bin/../share/bioconductor-data-packages
+ json=/ /.../.../Software/pcgr_conda/pcgrr/bin/../share/bioconductor-data-packages/dataURLs.json
++ yq '."genomeinfodbdata-1.2.11".fn' / /.../.../Software/pcgr_conda/pcgrr/bin/../share/bioconductor-data-packages/dataURLs.json
+ FN='"GenomeInfoDbData_1.2.11.tar.gz"'
++ yq '."genomeinfodbdata-1.2.11".urls[]'  /.../.../Software/pcgr_conda/pcgrr/bin/../share/bioconductor-data-packages/dataURLs.json
+ IFS=
+ read -r value
+ URLS+=($value)
+ IFS=
+ read -r value
+ URLS+=($value)
+ IFS=
+ read -r value
+ URLS+=($value)
+ IFS=
+ read -r value
++ yq '."genomeinfodbdata-1.2.11".md5' /.../.../Software/pcgr_conda/pcgrr/bin/../share/bioconductor-data-packages/dataURLs.json
+ MD5='"2a4cbfc2031992fed3c9445f450890a2"'
+ STAGING= /.../.../Software/pcgr_conda/pcgrr/share/genomeinfodbdata-1.2.11
+ mkdir -p  /.../.../Software/pcgr_conda/pcgrr/share/genomeinfodbdata-1.2.11
+ TARBALL= /.../.../Software/pcgr_conda/pcgrr/share/genomeinfodbdata-1.2.11/"GenomeInfoDbData_1.2.11.tar.gz"'
+ SUCCESS=0
+ for URL in '${URLS[@]}'
++ echo '"https://bioconductor.org/packages/3.18/data/annotation/src/contrib/GenomeInfoDbData_1.2.11.tar.gz"'
++ tr -d '"'
+ URL=https://bioconductor.org/packages/3.18/data/annotation/src/contrib/GenomeInfoDbData_1.2.11.tar.gz
++ echo '"2a4cbfc2031992fed3c9445f450890a2"'
++ tr -d '"'
+ MD5=2a4cbfc2031992fed3c9445f450890a2
+ curl -L https://bioconductor.org/packages/3.18/data/annotation/src/contrib/GenomeInfoDbData_1.2.11.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   416  100   416    0     0   5918      0 --:--:-- --:--:-- --:--:--  5942
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (60) SSL certificate problem: self-signed certificate in certificate chain
More details here: https://curl.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

return code: 60

kwargs:
{}

Traceback (most recent call last):
  File " /.../.../Software/anaconda3/lib/python3.11/site-packages/conda/exception_handler.py", line 17, in __call__
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File " /.../.../Software/anaconda3/lib/python3.11/site-packages/conda/cli/main.py", line 83, in main_subshell
    exit_code = do_call(args, parser)
                ^^^^^^^^^^^^^^^^^^^^^
  File " /.../.../Software/anaconda3/lib/python3.11/site-packages/conda/cli/conda_argparse.py", line 196, in do_call
    result = getattr(module, func_name)(args, parser)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File " /.../.../Software/anaconda3/lib/python3.11/site-packages/conda/notices/core.py", line 124, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File " /.../.../Software/anaconda3/lib/python3.11/site-packages/conda/cli/main_create.py", line 125, in execute
    return install(args, parser, "create")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File " /.../.../Software/anaconda3/lib/python3.11/site-packages/conda/cli/install.py", line 265, in install
    explicit(specs, prefix, verbose=not context.quiet, index_args=index_args)
  File " /.../.../Software/anaconda3/lib/python3.11/site-packages/conda/misc.py", line 141, in explicit
    txn.execute()
  File " /.../.../Software/anaconda3/lib/python3.11/site-packages/conda/core/link.py", line 349, in execute
    self._execute(
  File " /.../.../Software/anaconda3/lib/python3.11/site-packages/conda/core/link.py", line 965, in _execute
    raise CondaMultiError(
conda.CondaMultiErrorclass: LinkError
message:
post-link script failed for package bioconda::bioconductor-genomeinfodbdata-1.2.11-r43hdfd78af_1
location of failed script:  /.../.../Software/pcgr_conda/pcgrr/bin/.bioconductor-genomeinfodbdata-post-link.sh
==> script messages <==
<None>
==> script output <==
stdout: 
stderr: ++ dirname --  /.../.../Software/pcgr_conda/pcgrr/bin/installBiocDataPackage.sh
+ SCRIPT_DIR= /.../.../Software/pcgr_conda/pcgrr/bin/../share/bioconductor-data-packages
+ json= /.../.../Software/pcgr_conda/pcgrr/bin/../share/bioconductor-data-packages/dataURLs.json
++ yq '."genomeinfodbdata-1.2.11".fn'  /.../.../Software/pcgr_conda/pcgrr/bin/../share/bioconductor-data-packages/dataURLs.json
+ FN='"GenomeInfoDbData_1.2.11.tar.gz"'
++ yq '."genomeinfodbdata-1.2.11".urls[]'  /.../.../Software/pcgr_conda/pcgrr/bin/../share/bioconductor-data-packages/dataURLs.json
+ IFS=
+ read -r value
+ URLS+=($value)
+ IFS=
+ read -r value
+ URLS+=($value)
+ IFS=
+ read -r value
+ URLS+=($value)
+ IFS=
+ read -r value
++ yq '."genomeinfodbdata-1.2.11".md5'  /.../.../Software/pcgr_conda/pcgrr/bin/../share/bioconductor-data-packages/dataURLs.json
+ MD5='"2a4cbfc2031992fed3c9445f450890a2"'
+ STAGING= /.../.../Software/pcgr_conda/pcgrr/share/genomeinfodbdata-1.2.11
+ mkdir -p  /.../.../Software/pcgr_conda/pcgrr/share/genomeinfodbdata-1.2.11
+ TARBALL=' /.../.../Software/pcgr_conda/pcgrr/share/genomeinfodbdata-1.2.11/"GenomeInfoDbData_1.2.11.tar.gz"'
+ SUCCESS=0
+ for URL in '${URLS[@]}'
++ echo '"https://bioconductor.org/packages/3.18/data/annotation/src/contrib/GenomeInfoDbData_1.2.11.tar.gz"'
++ tr -d '"'
+ URL=https://bioconductor.org/packages/3.18/data/annotation/src/contrib/GenomeInfoDbData_1.2.11.tar.gz
++ echo '"2a4cbfc2031992fed3c9445f450890a2"'
++ tr -d '"'
+ MD5=2a4cbfc2031992fed3c9445f450890a2
+ curl -L https://bioconductor.org/packages/3.18/data/annotation/src/contrib/GenomeInfoDbData_1.2.11.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   416  100   416    0     0   5918      0 --:--:-- --:--:-- --:--:--  5942
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (60) SSL certificate problem: self-signed certificate in certificate chain
More details here: https://curl.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

return code: 60

kwargs:
{}

: <exception str() failed>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File " /.../.../Software/anaconda3/bin/conda", line 13, in <module>
    sys.exit(main())
             ^^^^^^
  File " /.../.../Software/anaconda3/lib/python3.11/site-packages/conda/cli/main.py", line 128, in main
    return conda_exception_handler(main, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File " /.../.../Software/anaconda3/lib/python3.11/site-packages/conda/exception_handler.py", line 388, in conda_exception_handler
    return_value = exception_handler(func, *args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../.../Software/anaconda3/lib/python3.11/site-packages/conda/exception_handler.py", line 20, in __call__
    return self.handle_exception(exc_val, exc_tb)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../.../Software/anaconda3/lib/python3.11/site-packages/conda/exception_handler.py", line 62, in handle_exception
    return self.handle_application_exception(exc_val, exc_tb)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../.../Software/anaconda3/lib/python3.11/site-packages/conda/exception_handler.py", line 78, in handle_application_exception
    self._print_conda_exception(exc_val, exc_tb)
  File "/.../.../Software/anaconda3/lib/python3.11/site-packages/conda/exception_handler.py", line 84, in _print_conda_exception
    print_conda_exception(exc_val, exc_tb)
  File "/.../.../Software/anaconda3/lib/python3.11/site-packages/conda/exceptions.py", line 1258, in print_conda_exception
    stderrlog.error("\n%r\n", exc_val)
  File "/.../.../Software/anaconda3/lib/python3.11/logging/__init__.py", line 1518, in error
    self._log(ERROR, msg, args, **kwargs)
  File "/.../.../Software/anaconda3/lib/python3.11/logging/__init__.py", line 1634, in _log
    self.handle(record)
  File "/.../.../Software/anaconda3/lib/python3.11/logging/__init__.py", line 1643, in handle
    if (not self.disabled) and self.filter(record):
                               ^^^^^^^^^^^^^^^^^^^
  File "/.../.../Software/anaconda3/lib/python3.11/logging/__init__.py", line 830, in filter
    result = f.filter(record)
             ^^^^^^^^^^^^^^^^
  File "/.../.../Software/anaconda3/lib/python3.11/site-packages/conda/gateways/logging.py", line 65, in filter
    record.msg = record.msg % new_args
                 ~~~~~~~~~~~^~~~~~~~~~
  File "/.../.../Software/anaconda3/lib/python3.11/site-packages/conda/__init__.py", line 104, in __repr__
    errs.append(e.__repr__())
                ^^^^^^^^^^^^
  File "/.../.../Software/anaconda3/lib/python3.11/site-packages/conda/__init__.py", line 58, in __repr__
    return f"{self.__class__.__name__}: {self}"
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../.../Software/anaconda3/lib/python3.11/site-packages/conda/__init__.py", line 62, in __str__
    return str(self.message % self._kwargs)
               ~~~~~~~~~~~~~^~~~~~~~~~~~~~
ValueError: unsupported format character 'T' (0x54) at index 2017

pdiakumis commented 4 months ago

Hi @cornejoem, thanks for trying out PCGR!

Can you delete the Software/pcgr_conda/pcgrr/ directory and then try re-installing the pcgrr conda environment? You might need to try that a few times, apologies. The issue is probably related to an intermittent connection error with the Bioconductor servers that host the GenomeInfoDbData_1.2.11 R data package, or something to do with curl and the way it checks certificates. This has been reported elsewhere: https://github.com/bcbio/bcbio-nextgen/issues/3676
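
The delete-and-retry suggestion above can be scripted. The following is a minimal sketch, assuming the prefix and lock-file location from the original post; `retry` and `reinstall_pcgrr` are hypothetical helper names, not part of PCGR or conda. Retrying only helps if the failure really is intermittent.

```shell
# retry MAX CMD...: run CMD up to MAX times, stopping at the first success.
retry() {
  max="$1"; shift
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt ${attempt}/${max} failed" >&2
    attempt=$((attempt + 1))
  done
  return 1
}

# Remove the partially built environment, then re-create it from the lock file.
# The prefix and URL are taken from the original post (assumed unchanged).
reinstall_pcgrr() {
  prefix="Software/pcgr_conda/pcgrr"
  lock="https://raw.githubusercontent.com/sigven/pcgr/v2.0.0/conda/env/lock/pcgrr-osx-64.lock"
  rm -rf "$prefix"
  CONDA_SUBDIR=osx-64 conda create --yes --prefix "$prefix" --file "$lock"
}

# Usage: retry 3 reinstall_pcgrr
```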

cornejoem commented 4 months ago

Hello @pdiakumis, thank you for reaching out so quickly.

Unfortunately, despite deleting the Software/pcgr_conda/pcgrr/ directory, refreshing my terminal session, and restarting my computer, I still encounter the same error. I will continue to try installing pcgrr, but any pointers are appreciated!

pdiakumis commented 4 months ago

Sorry about that Elena. Could you also try the following:

echo "insecure" >> $HOME/.curlrc

Then open a new terminal session and try again, and get back to us if that doesn't work either.
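
Note that an `insecure` line in `.curlrc` switches off TLS certificate verification for every curl call made by that user, not only for the conda post-link script. One way to limit the exposure is to enable the setting only while the environment is being installed and remove it afterwards. This is a sketch; the helper names and the `CURLRC` variable are assumptions made for illustration.

```shell
# Path of the curl config file; defaults to the one used in this thread.
CURLRC="${CURLRC:-$HOME/.curlrc}"

# Append "insecure" unless the file already contains it as a whole line.
enable_insecure() {
  grep -qx 'insecure' "$CURLRC" 2>/dev/null || echo 'insecure' >> "$CURLRC"
}

# Remove the "insecure" line again, keeping any other settings in the file.
disable_insecure() {
  [ -f "$CURLRC" ] || return 0
  # grep -v exits non-zero when nothing is left to print, hence "|| true".
  grep -vx 'insecure' "$CURLRC" > "$CURLRC.tmp" || true
  mv "$CURLRC.tmp" "$CURLRC"
}

# Usage:
#   enable_insecure
#   ... re-create the pcgrr environment in a new terminal session ...
#   disable_insecure
```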

I am on an M1 Mac and just tried re-installing both environments and it's working without issues from my side.

The script that runs the curl command that fails is under pcgrr/bin/installBiocDataPackage.sh. On line 26 you'll see it has curl -L $URL > $TARBALL.
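
To test the failing step outside conda, the same download can be run by hand and the file checked against the MD5 shown in the log. This is only a sketch: `md5_of` and `verify_md5` are hypothetical helpers, the URL and checksum are copied from the error output in the original post, and it is not a copy of what installBiocDataPackage.sh does internally.

```shell
# Print the MD5 of a file, using md5sum (Linux) or md5 (macOS).
md5_of() {
  if command -v md5sum >/dev/null 2>&1; then
    md5sum "$1" | cut -d' ' -f1
  else
    md5 -q "$1"
  fi
}

# Succeed only if the file's MD5 equals the expected value.
verify_md5() {
  [ "$(md5_of "$1")" = "$2" ]
}

# Usage (values taken from the log above):
#   URL="https://bioconductor.org/packages/3.18/data/annotation/src/contrib/GenomeInfoDbData_1.2.11.tar.gz"
#   curl -fL "$URL" -o GenomeInfoDbData_1.2.11.tar.gz
#   verify_md5 GenomeInfoDbData_1.2.11.tar.gz 2a4cbfc2031992fed3c9445f450890a2 && echo "checksum OK"
```

If the manual `curl` call fails with the same error 60, the problem lies with curl or the network (for example a proxy presenting its own certificate) rather than with conda.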

Can you also let me know what version of curl you're running? Here's mine:

$ curl --version

curl 8.6.0 (x86_64-apple-darwin23.0) libcurl/8.6.0 (SecureTransport) LibreSSL/3.3.6 zlib/1.2.12 nghttp2/1.61.0
Release-Date: 2024-01-31
Protocols: dict file ftp ftps gopher gophers http https imap imaps ipfs ipns ldap ldaps mqtt pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: alt-svc AsynchDNS GSS-API HSTS HTTP2 HTTPS-proxy IPv6 Kerberos Largefile libz MultiSSL NTLM NTLM_WB SPNEGO SSL threadsafe UnixSockets

pdiakumis commented 4 months ago

If that doesn't work either, I would suggest giving the Docker installation a try if possible: https://sigven.github.io/pcgr/articles/installation.html#b--docker

The Docker image includes all conda environments pre-installed so at least you wouldn't need to download anything else.

cornejoem commented 4 months ago

Running echo "insecure" >> $HOME/.curlrc and refreshing the terminal window seems to have done it. I was able to complete at least one run, with all expected output files. For other samples, however, execution halts with this error:

2024-07-03 12:48:12 - pcgr-report-generation - INFO - Calculating data for rainfall plot
Error in if (startsWith(m, "chr")) { : 
  missing value where TRUE/FALSE needed
Calls: <Anonymous> -> <Anonymous>
Execution halted

Yet I have run these files previously with PCGR 1.4.1 and no errors occurred.

sigven commented 4 months ago

Hi Elena, just to rule out that your error is related to your Conda installation: for any of the failing cases, do you see the same errors with Docker/Apptainer? If so, could you share the complete error log, run command, and input file(s) for one such case? It could be that you have caught another issue that is unrelated to the Conda installation.

Thanks for reporting!

Kind regards, Sigve

cornejoem commented 4 months ago

Hello,

I encountered the same error when running PCGR with Docker. I would be happy to share the error log and run commands for each case. However, as these are human patient samples, I need to check whether I can share the input files.

Many thanks! Elena

sigven commented 4 months ago

That's of course understandable. I'm just curious about the nature of the input somatic calls that cause the error message. Do the failing VCF files contain calls from chromosome patches etc., meaning not only 1-22, X, and Y? Is the same true of the PCGR-annotated VCF files? Contrasting your input with the example VCFs that come with PCGR might give some clue; we might have missed something in our input check/validation routines.

Another tip is to turn on the --debug option, and look for potential errors in all intermediate log files.

Best, Sigve

cornejoem commented 4 months ago

Hello Sigve and Peter,

Can I send you the error logs and commands via email? I would include runs with PCGR v1.4.1 and v2.0.0 for the same input files. If you need anything else, please let me know.

BW Elena

pdiakumis commented 4 months ago

Hi Elena, Of course - peterdiakumis@gmail.com.

By the way, your error above occurs with this code: https://github.com/sigven/pcgr/blob/c2d47f0/pcgrr/R/mutational_signatures.R#L715

I don't think that code has changed between PCGR 1.4.1 and 2.0.0. One thing that has changed is the R version we use, since we've gone from v4.1.3 to v4.3.3. There have been some changes in the way the newer R version handles different data structures, so it could well be related to the R version.

Since you can't share the VCF input file, can you try running a couple of diagnostic commands on it using bcftools (which should be installed in the pcgr conda environment)?

vcf1="path/to/your/input1.vcf.gz"
# which chromosomes are in your VCF
bcftools view -H ${vcf1} | cut -f1 | uniq -c

I am wondering if there's anything fishy going on in there with a missing chromosome value or something. Anyway, we can talk more via email if needed :-)
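
To narrow the output of the bcftools command above to the suspicious entries, the chromosome column can be filtered for names outside 1-22, X, and Y. This is a sketch; `noncanonical_contigs` is a hypothetical helper, and treating everything else (including MT and patch contigs) as non-canonical is an assumption based on the discussion here, not on PCGR's own validation rules.

```shell
# Read contig names on stdin and print the unique ones that are not
# 1-22, X, or Y (an optional "chr" prefix is accepted). Empty names are printed too.
noncanonical_contigs() {
  # grep -v exits non-zero when every name is canonical, hence "|| true".
  sort -u | grep -Ev '^(chr)?([1-9]|1[0-9]|2[0-2]|X|Y)$' || true
}

# Usage:
#   vcf1="path/to/your/input1.vcf.gz"
#   bcftools view -H "${vcf1}" | cut -f1 | noncanonical_contigs
```

No output means every record sits on a canonical chromosome, in which case the contig names are probably not the cause of the error.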

sigven commented 4 months ago

Hi Elena, Sure thing, please do. peterdiakumis and sigven, both @gmail.com.

thanks!

best, Sigve