kfuku52 / amalgkit

RNA-seq data amalgamation for a large-scale evolutionary transcriptomics
BSD 3-Clause "New" or "Revised" License

getfastq and prefetch 2.8.0. #18

Closed Hego-CCTB closed 3 years ago

Hego-CCTB commented 3 years ago

currently doing a fresh install of amalgkit and dependencies on a virtual machine and it looks like getfastq is currently incompatible with the latest version of prefetch.

it looks like prefetch got rid of the --output_directory parameter, so getfastq now throws an error saying it doesn't know the parameter. The fix is simple, we just have to remove the --output_directory parameter from amalgkit.

The question is, do we update amalgkit to the latest prefetch version, or do we just require an old version of prefetch instead?

kfuku52 commented 3 years ago

We should let amalgkit automatically change its behavior depending on the prefetch version. Could you work on this? I'll push one commit today, so please update your local copy later.
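A minimal sketch of what version-dependent behavior could look like, not the code that ended up in amalgkit. The helper names (`parse_version`, `get_prefetch_version`, `build_prefetch_command`) are hypothetical, the `prefetch : 2.8.0` output format of `prefetch --version` is an assumption, and the set of versions treated as lacking `--output-directory` is taken only from what is reported in this thread.

```python
import re
import subprocess

# Assumption from this thread only: prefetch 2.8.x does not know --output-directory.
VERSIONS_WITHOUT_OUTPUT_DIR = {(2, 8)}


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints, or None."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def get_prefetch_version(exe="prefetch"):
    """Run `prefetch --version` and parse its output; None if prefetch is unavailable."""
    try:
        result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    return parse_version(result.stdout + result.stderr)


def build_prefetch_command(sra_id, out_dir, version, exe="prefetch"):
    """Build the prefetch argument list, adding --output-directory only when supported."""
    command = [exe]
    if version is not None and version[:2] not in VERSIONS_WITHOUT_OUTPUT_DIR:
        command += ["--output-directory", out_dir]
    command.append(sra_id)
    return command
```

With this shape, getfastq would call `get_prefetch_version()` once and pass the result to `build_prefetch_command`, so the version check stays in one place.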

kfuku52 commented 3 years ago

Pushed. Done faster than I thought. Also, we've started managing the amalgkit version in __init__.py. Please update it when you push.

Hego-CCTB commented 3 years ago

gotcha!

Hego-CCTB commented 3 years ago

ah, found out that it was a different bug that caused prefetch to fail, which will be fixed with the next push.

prefetch 2.8. still complains about --output-directory not existing, but just ignores it.

Hego-CCTB commented 3 years ago

short update: even though the download finishes successfully in prefetch 2.8., amalgkit will throw an error afterwards, because it can't find the .sra file in the working directory. prefetch 2.8. downloads sra data into /ncbi/public/sra/ unless the user has changed the sratoolkit config themselves.

I fixed the parameter error and added a check for /ncbi/public/sra/ as well, so both legacy and stable prefetch versions are supported.
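For illustration, the fallback check described here might look roughly like the sketch below. It is not the actual amalgkit change: `find_sra_file` and its parameters are hypothetical names, and the `<sra_id>.sra` file naming is an assumption.

```python
from pathlib import Path


def find_sra_file(sra_id, work_dir, fallback_dirs=()):
    """Look for <sra_id>.sra in work_dir first, then in each fallback directory.

    Returns the first matching path, or None if the file is not found anywhere.
    """
    for directory in (work_dir, *fallback_dirs):
        candidate = Path(directory) / f"{sra_id}.sra"
        if candidate.is_file():
            return candidate
    return None
```

A caller would pass the prefetch default location mentioned above as a fallback, e.g. `find_sra_file(sra_id, work_dir, ["/ncbi/public/sra/"])`, and raise the "file not found" error only when the function returns `None`.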