Open sergpolly opened 5 years ago
Had the same problem. Here is what SRA says about it.
Thanks @Phlya! When I checked the presence of the sra links one by one, I also found that some of them were missing from ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR. When I removed the sra's that weren't present from my project.yml, my distiller project is running fine, but this doesn't solve the problem of course...
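For anyone who wants to automate the one-by-one check described above, here is a rough Python sketch. It assumes the ByRun layout is `<base>/<first 3 characters>/<first 6 characters>/<accession>/<accession>.sra`, which is inferred from the FTP path quoted in this thread and not confirmed against the distiller source. The helper names (`sra_ftp_url`, `url_exists`, `split_by_availability`) are made up for illustration.

```python
from typing import Callable, Iterable, List, Tuple
from urllib.error import URLError
from urllib.request import urlopen

# Base path as quoted in this thread; the sub-directory layout below is an assumption.
BASE = "ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/sra"


def sra_ftp_url(accession: str) -> str:
    """Build the assumed ByRun URL, e.g. .../SRR/SRR027/SRR027959/SRR027959.sra."""
    return f"{BASE}/{accession[:3]}/{accession[:6]}/{accession}/{accession}.sra"


def url_exists(url: str, timeout: float = 30.0) -> bool:
    """Return True if the URL can be opened; the body is never read."""
    try:
        with urlopen(url, timeout=timeout):
            return True
    except (URLError, OSError):
        return False


def split_by_availability(
    accessions: Iterable[str],
    exists: Callable[[str], bool] = url_exists,
) -> Tuple[List[str], List[str]]:
    """Split accessions into (present, missing) according to the exists check."""
    present: List[str] = []
    missing: List[str] = []
    for accession in accessions:
        target = present if exists(sra_ftp_url(accession)) else missing
        target.append(accession)
    return present, missing
```

Calling `split_by_availability(["SRR027959"])` would report that run as missing if its link is broken. The `exists` argument is there so the check can be swapped out, for example for a dry run without network access.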
I just downloaded those missing ones manually... Wget download is much faster than the regular sra tools, but maybe in case of this problem distiller should fall back to fastq-dump?
Yes, I figured I will have to do manual download for now as well. Thanks!
https://github.com/mirnylab/distiller-nf/commit/2f259f58a69b3063d453a86d4bd5552c4d2c8d4c
this update will make distiller use fastq-dump if wget fails. Does it look good? Any other fixes we could implement to the downloading process (i.e. try multiple URLs), while we're at it?
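To make the fallback idea concrete, here is a minimal Python sketch of "try wget on each candidate URL, then fall back to fastq-dump". It is not the code from the commit above, only an illustration of the ordering. The function names are hypothetical, and the candidate URLs are whatever the caller supplies.

```python
import subprocess
from typing import Callable, List, Sequence


def run_first_successful(
    commands: Sequence[List[str]],
    runner: Callable[[List[str]], int] = subprocess.call,
) -> int:
    """Run commands in order; return the index of the first one that exits with 0."""
    for index, command in enumerate(commands):
        try:
            if runner(command) == 0:
                return index
        except OSError:
            # The tool itself is missing or not executable: try the next command.
            continue
    raise RuntimeError("all download commands failed")


def download_commands(accession: str, urls: Sequence[str]) -> List[List[str]]:
    """wget each candidate URL first, then let fastq-dump fetch by accession."""
    commands = [["wget", "-O", f"{accession}.sra", url] for url in urls]
    commands.append(["fastq-dump", "--split-files", "--gzip", accession])
    return commands
```

If one of the wget commands succeeds, the downloaded `.sra` file still has to be converted to FASTQ in a separate step. The returned index tells the caller which route was taken.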
Btw, here is another trick to force using fastq-dump, in project.yml specify input as: library1: lane1:
Thank you @golobor and @Phlya ! that was uber quick!
Does it look good?
we are about to try it - we'll let you know here
Any other fixes we could implement to the downloading process (i.e. try multiple URLs), while we're at it?
hmmmm - I do it so rarely that I don't really know what to say ... maybe @Phlya has suggestions? MirnyLab people?
If anything, I would like a reminder of why we aren't doing it the Nextflow way: https://www.nextflow.io/docs/edge/channel.html#fromsra https://www.nextflow.io/blog/2019/release-19.03.0-edge.html. Is it because we didn't have time to do it, or because there is something wrong with it?
Maybe it's worth switching. Is anyone interested in implementing it? :) I think there was a time when it was returning non-gzipped files, but there has been much improvement lately, so we should probably consider switching.
Worked for me on the original 2009 Hi-C data. I guess @Marlies1993 will report here once she tries it as well.
Thank you, again!
maybe it's worth switching, anyone is interested in implementing? :)
sounds fun to me - should simplify some of the distiller.nf code - not sure about timeline requirements though ...
Thanks for fixing this so quickly! All my sra's are downloading and mapping as well.
@Marlies1993 was running a distiller with some SRA-s as an input and the pipeline kept crashing at the sra step... After closer inspection it appears that some of the links of this form: https://github.com/mirnylab/distiller-nf/blob/01f6f7bbc4b1edfc3634c131f709b08a40164c74/distiller.nf#L176 are broken ...
For example, take SRR027959 from the 2009 Hi-C paper:
I don't know enough about SRAs or why we are downloading them using wget - anyone?
@Marlies1993 can comment and provide other examples if needed.