AlexsLemonade / refinebio

Refine.bio harmonizes petabytes of publicly available biological data into ready-to-use datasets for cancer researchers and AI/ML scientists.
https://www.refine.bio/

Fix failing downloader_no_op_tests job #3079

Closed arkid15r closed 1 year ago

arkid15r commented 2 years ago

Context

The error message(s):

https://github.com/AlexsLemonade/refinebio/actions/runs/3109588470/jobs/5039930709#step:9:5124

#17 169.6 configure: Package dependency requirement 'libgit2 >= 0.26.0' could not be satisfied.
#17 169.6 
#17 169.6   -----------------------------------------------------------------------
#17 169.6 
#17 169.6    Unable to find the libgit2 library on this system. Building 'git2r'
#17 169.6    using the bundled source of the libgit2 library.
#17 169.6 
#17 169.6    To build git2r with a system installation of libgit2, please install:
#17 169.6      libgit2-dev   (package on e.g. Debian and Ubuntu)
#17 169.6      libgit2-devel (package on e.g. Fedora, CentOS and RHEL)
#17 169.6      libgit2       (Homebrew package on OS X)
#17 169.6    and try again.
#17 169.6 
#17 169.6    If the libgit2 library is installed on your system but the git2r
#17 169.6    configuration is unable to find it, you can specify the include and
#17 169.6    lib path to libgit2 with:
#17 169.6 
#17 169.6    given you downloaded a tar-gz archive:
#17 169.6    R CMD INSTALL git2r-.tar.gz --configure-vars='INCLUDE_DIR=/path/to/include LIB_DIR=/path/to/lib'
#17 169.6 
#17 169.6    or cloned the GitHub git2r repository into a directory:
#17 169.6    R CMD INSTALL git2r/ --configure-vars='INCLUDE_DIR=/path/to/include LIB_DIR=/path/to/lib'
#17 169.6 
#17 169.6    or download and install git2r in R using
#17 169.6    install.packages('git2r', type='source', configure.vars='LIB_DIR=-L/path/to/libs INCLUDE_DIR=-I/path/to/headers')
#17 169.6 
#17 169.6    On macOS, another possibility is to let the configuration
#17 169.6    automatically download the libgit2 library from the Homebrew
#17 169.6    package manager with:
#17 169.6 
#17 169.6    R CMD INSTALL git2r-.tar.gz --configure-vars='autobrew=yes'
#17 169.6    or
#17 169.6    R CMD INSTALL git2r/ --configure-vars='autobrew=yes'
#17 169.6    or
#17 169.6    install.packages('git2r', type='source', configure.vars='autobrew=yes')
#17 169.6 
#17 169.6   -----------------------------------------------------------------------
#17 169.6 
#17 169.6 
#17 169.6 configure: Attempting configuration of bundled libgit2
#17 169.6 checking size of void*... 8
#17 169.8 checking for zlib... yes
#17 169.8 checking for openssl... yes
#17 169.8 checking for libssh2... no
#17 169.8 configure: WARNING:
#17 169.8   ---------------------------------------------
#17 169.8    Unable to find the LibSSH2 (ver >= v1.8)
#17 169.8    library on this system. Building git2r
#17 169.8    without support for SSH transport.
#17 169.8 
#17 169.8    To build with SSH support, please install:
#17 169.8      libssh2-1-dev (package on e.g. Debian and Ubuntu)
#17 169.8      libssh2-devel (package on e.g. Fedora, CentOS and RHEL)
#17 169.8      libssh2 (Homebrew package on OS X)
#17 169.8    and try again.
#17 169.8 
#17 169.8    If the LibSSH2 library is installed on
#17 169.8    your system but the git2r configuration
#17 169.8    is unable to find it, you can specify
#17 169.8    the include and lib path to LibSSH2 with:
#17 169.8    R CMD INSTALL git2r --configure-vars='LIBS=-L/path/to/libs CPPFLAGS=-I/path/to/headers'

https://github.com/AlexsLemonade/refinebio/actions/runs/3109588470/jobs/5039930709#step:9:8555

Traceback (most recent call last):
  File "/home/user/data_refinery_workers/downloaders/sra.py", line 74, in _download_file_http
    download_file(download_url, target_file_path)
  File "/usr/local/lib/python3.6/dist-packages/data_refinery_common/utils.py", line 415, in download_file
    download_file(download_url, target_file_path, retry + 1)
  File "/usr/local/lib/python3.6/dist-packages/data_refinery_common/utils.py", line 415, in download_file
    download_file(download_url, target_file_path, retry + 1)
  File "/usr/local/lib/python3.6/dist-packages/data_refinery_common/utils.py", line 415, in download_file
    download_file(download_url, target_file_path, retry + 1)
  [Previous line repeated 6 more times]
  File "/usr/local/lib/python3.6/dist-packages/data_refinery_common/utils.py", line 417, in download_file
    raise e
  File "/usr/local/lib/python3.6/dist-packages/data_refinery_common/utils.py", line 401, in download_file
    with requests.get(download_url, stream=True) as r:
  File "/usr/local/lib/python3.6/dist-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 528, in request
    prep = self.prepare_request(req)
  File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 466, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/usr/local/lib/python3.6/dist-packages/requests/models.py", line 316, in prepare
    self.prepare_url(url, params)
  File "/usr/local/lib/python3.6/dist-packages/requests/models.py", line 390, in prepare_url
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL 'some_path': No schema supplied. Perhaps you meant http://some_path?
F2022-09-23 00:48:37,330 local data_refinery_workers.downloaders.utils DEBUG [downloader_job: 11]: Starting Downloader Job.
2022-09-23 00:48:37,356 local data_refinery_workers.downloaders.sra DEBUG [downloader_job: 11]: Downloading file from era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/SRR160/001/SRR1603661/SRR1603661_1.fastq.gz to /home/user/data_store/SRR1603661/SRR1603661/SRR1603661_1.fastq.gz via Aspera.
2022-09-23 00:52:39,065 local data_refinery_workers.downloaders.sra DEBUG [downloader_job: 11]: Downloading file from era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/SRR160/001/SRR1603661/SRR1603661_2.fastq.gz to /home/user/data_store/SRR1603661/SRR1603661/SRR1603661_2.fastq.gz via Aspera.
2022-09-23 00:56:45,964 local data_refinery_common.job_management DEBUG [downloader_job: 11] [processor_job: 5]: Queuing processor job.
2022-09-23 00:56:45,964 local data_refinery_workers.downloaders.utils DEBUG [downloader_job: 11]: Downloader Job completed successfully.
2022-09-23 00:56:45,978 local data_refinery_workers.downloaders.utils DEBUG [downloader_job: 11]: Downloader Job completed successfully.
.2022-09-23 00:56:46,003 local data_refinery_workers.downloaders.utils DEBUG [downloader_job: 12]: Starting Downloader Job.
2022-09-23 00:56:46,028 local data_refinery_workers.downloaders.transcriptome_index DEBUG [downloader_job: 12]: Downloading file from ftp://ftp.ensemblgenomes.org/pub/release-37/plants/gtf/aegilops_tauschii/Aegilops_tauschii.ASM34733v1.37.gtf.gz to /home/user/data_store/Aegilops_tauschiiASM34733v137_long/Aegilops_tauschii.ASM34733v1.37.gtf.gz.
2022-09-23 00:56:48,888 local data_refinery_workers.downloaders.transcriptome_index DEBUG [downloader_job: 12]: Files downloaded successfully.
2022-09-23 00:56:48,892 local data_refinery_workers.downloaders.utils DEBUG [downloader_job: 12]: Downloader Job completed successfully.
.
======================================================================
FAIL: test_download_file_ncbi (data_refinery_workers.downloaders.test_sra.DownloadSraTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/user/data_refinery_workers/downloaders/test_sra.py", line 84, in test_download_file_ncbi
    self.assertTrue(result)
AssertionError: False is not true

======================================================================
FAIL: test_download_file_swapper (data_refinery_workers.downloaders.test_sra.DownloadSraTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/user/data_refinery_workers/downloaders/test_sra.py", line 111, in test_download_file_swapper
    self.assertTrue(result)
AssertionError: False is not true
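
The traceback above shows download_file calling itself with retry + 1 until it gives up, even though a URL with no scheme ('some_path') can never succeed. As a rough sketch only, and not the actual data_refinery_common implementation, a downloader could check the URL up front and retry only failures that might be transient. The names download_with_retry, InvalidDownloadURL, and the injected fetch callable are all hypothetical and exist only for illustration:

```python
import time
from urllib.parse import urlparse


class InvalidDownloadURL(ValueError):
    """Hypothetical error for URLs that no amount of retrying can fix."""


def download_with_retry(url, fetch, max_retries=10, delay=0.0):
    """Call fetch(url), retrying only on OSError.

    `fetch` is an injected callable (an assumption of this sketch) so that
    tests can replace the real network call with a stub.
    """
    if not urlparse(url).scheme:
        # Fail fast: a URL such as 'some_path' has no scheme, so every
        # retry would fail in exactly the same way.
        raise InvalidDownloadURL(f"No scheme supplied in {url!r}")

    last_error = None
    for _ in range(max_retries + 1):
        try:
            return fetch(url)
        except OSError as error:
            # Treated as possibly transient in this sketch; wait and retry.
            last_error = error
            time.sleep(delay)
    raise last_error
```

Using a loop instead of recursion keeps the traceback short when all attempts fail, and the early scheme check surfaces the real problem on the first call.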

Solution or next step

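Fix the two failing tests in test_sra.py: test_download_file_ncbi and test_download_file_swapper. Both assert that the download result is truthy. The traceback above suggests the download fails because the placeholder URL 'some_path' reaches requests.get and is rejected with MissingSchema.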
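
One possible direction, sketched here under assumptions, is to stub the network call in the unit test so that a placeholder such as 'some_path' never reaches the HTTP library. FakeDownloader and its fetch and download methods are hypothetical stand-ins, not the real data_refinery_workers code:

```python
import unittest
from unittest import mock


class FakeDownloader:
    """Hypothetical stand-in for a downloader that wraps a network call."""

    def fetch(self, url, target):
        # Placeholder for the real network access; unit tests should stub it.
        raise RuntimeError("network access is not allowed in unit tests")

    def download(self, url, target):
        # Report success or failure as a boolean, as the failing tests expect.
        try:
            self.fetch(url, target)
        except Exception:
            return False
        return True


class DownloadTest(unittest.TestCase):
    def test_download_succeeds_with_stubbed_fetch(self):
        downloader = FakeDownloader()
        # Replace the network call on this instance with a stub.
        with mock.patch.object(downloader, "fetch", return_value=None) as stub:
            self.assertTrue(downloader.download("some_path", "/tmp/target"))
        stub.assert_called_once_with("some_path", "/tmp/target")
```

Without the stub, download returns False for the same inputs, which matches the "False is not true" assertion failures reported above.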

arkid15r commented 1 year ago

Obsolete