rkimoakbioinformatics / oakvar

Genomic variant analysis platform

Problem updating dbsnp #17

Closed Alex-Karmazin closed 2 years ago

Alex-Karmazin commented 2 years ago

I have been trying to update dbsnp for two days. It takes a while to get to 55% or a little more, and then I either get errors or the download stops. It would be better to split big files into smaller chunks and let the system resume from the last successfully downloaded chunk. I got this error:

"""
packaging.version.InvalidVersion: Invalid version: ''
[2022:08:05 16:42:39] Starting to install dbsnp:154.0.2.1...
[2022:08:05 16:42:41] Downloading code archive of dbsnp:154.0.2.1...
[2022:08:05 16:42:46] Extracting code archive of dbsnp:154.0.2.1...
[2022:08:05 16:42:46] Downloading data of dbsnp:154.0.2.1...
file_sizes: 55%|████████████▋ | 8.56G/15.5G [4:05:34<3:19:56, 581kB/s]
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\openCravatPlugin\lib\site-packages\download\download.py", line 270, in _fetch_file
    raise Exception(
Exception: Error: File size is 8561147904 and should be 15531585003* Please wait some time and try re-downloading the file again.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\openCravatPlugin\lib\site-packages\oakvar\webstore\webstore.py", line 104, in fetch_install_queue
    install_module(
  File "C:\ProgramData\Anaconda3\envs\openCravatPlugin\lib\site-packages\oakvar\module\__init__.py", line 500, in install_module
    raise e
  File "C:\ProgramData\Anaconda3\envs\openCravatPlugin\lib\site-packages\oakvar\module\__init__.py", line 480, in install_module
    download_data(args=args)
  File "C:\ProgramData\Anaconda3\envs\openCravatPlugin\lib\site-packages\oakvar\module\__init__.py", line 339, in download_data
    download(args.get("data_url"), zipfile_path)
  File "C:\ProgramData\Anaconda3\envs\openCravatPlugin\lib\site-packages\oakvar\store\__init__.py", line 87, in download
    download.download(url, fpath, kind="file", verbose=False, replace=True)
  File "C:\ProgramData\Anaconda3\envs\openCravatPlugin\lib\site-packages\download\download.py", line 119, in download
    _fetch_file(
  File "C:\ProgramData\Anaconda3\envs\openCravatPlugin\lib\site-packages\download\download.py", line 277, in _fetch_file
    raise RuntimeError(
RuntimeError: Error while fetching file https://store.opencravat.org/modules/dbsnp/154.0.1/dbsnp.data.zip. Dataset fetching aborted. Error: Error: File size is 8561147904 and should be 15531585003* Please wait some time and try re-downloading the file again.
"""

rkimoakbioinformatics commented 2 years ago

@Alex-Karmazin Ah yes, now that I have full control of the OakVar store, I can finally do that. Thanks for reminding me. I'll post here when it's done.

antonkulaga commented 2 years ago

I personally installed it without issues. However, for the majority of people who self-host OakVar, downloading the huge dbsnp module is often a challenge because of its size and unstable internet connections.

rkimoakbioinformatics commented 2 years ago

@Alex-Karmazin @antonkulaga OakVar v2.5.23 has been released. It can install a module by downloading it in chunks. I have uploaded the dbsnp module in chunks as an example. Running ov module install dbsnp will show this in action. If the download of a chunk fails, running ov module install dbsnp again will resume from the failed chunk.
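
For readers curious how a chunk-level resume like this can work in general, here is a minimal Python sketch. It is not OakVar's actual implementation: the function name `chunks_to_download`, the chunk file names, and the idea of a manifest that maps each chunk to its expected size are all assumptions made for illustration. A second install run would only need to fetch the chunks this function returns.

```python
import os
from typing import Dict, List


def chunks_to_download(expected_sizes: Dict[str, int], dest_dir: str) -> List[str]:
    """Return the names of chunks that are missing or incomplete in dest_dir.

    expected_sizes is a hypothetical manifest: chunk file name -> size in bytes.
    A chunk counts as done only if it exists with exactly the expected size.
    """
    pending = []
    for name, expected in expected_sizes.items():
        path = os.path.join(dest_dir, name)
        # -1 never equals a real size, so a missing file is always pending.
        on_disk = os.path.getsize(path) if os.path.isfile(path) else -1
        if on_disk != expected:
            pending.append(name)
    return pending
```

With a manifest of three chunks where the first is complete and the second was cut off, the function returns the second and third chunk names, so the completed chunk is not downloaded again.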

Alex-Karmazin commented 2 years ago

Thank you, it works even better than I expected. After a disconnect, the next run starts not from the beginning of the current chunk but exactly from the point where the download stopped.
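
Resuming from the exact byte where a download stopped is commonly done with an HTTP Range request. The sketch below shows that general technique using only Python's standard library. It is an assumption that OakVar does something similar, and the names `build_resume_request` and `download_with_resume` are hypothetical.

```python
import os
import shutil
import urllib.request


def build_resume_request(url: str, part_path: str) -> urllib.request.Request:
    """Build a request that asks only for the bytes not yet saved in part_path."""
    offset = os.path.getsize(part_path) if os.path.exists(part_path) else 0
    req = urllib.request.Request(url)
    if offset > 0:
        # Ask the server to start at the first byte we do not have yet.
        req.add_header("Range", f"bytes={offset}-")
    return req


def download_with_resume(url: str, part_path: str) -> None:
    """Append to a partial file if the server honors Range, else start over.

    Note: if part_path is already complete, a server will typically answer
    416 and urlopen raises HTTPError; callers should check sizes first.
    """
    req = build_resume_request(url, part_path)
    with urllib.request.urlopen(req) as resp:
        # 206 Partial Content means the Range header was honored.
        mode = "ab" if resp.status == 206 else "wb"
        with open(part_path, mode) as out:
            shutil.copyfileobj(resp, out)
```

Falling back to a full download when the server answers 200 instead of 206 keeps the file correct even on servers that ignore Range headers.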