robmaz / distmap

Sequence alignment on Hadoop

Should DownloadDistmapResult check the _SUCCESS file before download? #88

Closed magicDGS closed 6 years ago

magicDGS commented 6 years ago

@robmaz - I found in our wiki help for mapping on the cluster that your script checks for the _SUCCESS file before downloading from the cluster with ReadTools. It would be quite easy to implement this check in DownloadDistmapResult itself, which would save the user from running the hadoop fs -ls /${pref}_myname/fastq_paired_end_mapping_bwa/_SUCCESS command.

In addition, because this check would prevent downloading files when the _SUCCESS file is not present, we could add an advanced argument to disable it (e.g., to download the data from a job that completed all but one block, identify how many reads were lost, and evaluate whether it is worth remapping).

Tell me what you think about this - it is really easy to implement as a bug fix, since downloading the output of an unsuccessful job is not desirable.
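
For illustration, the manual check described above could be automated roughly as follows. This is a minimal Python sketch, not the ReadTools implementation: the function names and the injectable `run` parameter are hypothetical, and it assumes a `hadoop` binary on the PATH. It uses `hadoop fs -test -e`, which exits with status 0 when the path exists, instead of parsing the output of `hadoop fs -ls`.

```python
import subprocess


def success_marker_path(output_dir):
    """Return the path of the _SUCCESS marker inside a job output directory."""
    return output_dir.rstrip("/") + "/_SUCCESS"


def job_succeeded(output_dir, run=subprocess.run):
    """Return True if the _SUCCESS marker exists in output_dir.

    `run` is a hypothetical injectable parameter so that the check can be
    exercised without a Hadoop installation.
    """
    cmd = ["hadoop", "fs", "-test", "-e", success_marker_path(output_dir)]
    # `hadoop fs -test -e` exits with 0 if the path exists, non-zero otherwise.
    return run(cmd).returncode == 0
```

Calling `job_succeeded` on the job's output directory (for example the `fastq_paired_end_mapping_bwa` directory from the command above) would replace the manual `hadoop fs -ls` step.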

robmaz commented 6 years ago

I think it would be a good idea to fail by default, with a reasonably descriptive error, if the _SUCCESS file is not present. And then maybe have a --force option that changes the error into a warning?

Cheers Rupert
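
The behaviour proposed in this comment (error by default, warning with --force) might look roughly like the Python sketch below. Everything here is an assumption made for illustration: the exception class, the function names, the message wording and the small command-line parser are hypothetical, and the option actually shipped in ReadTools may be named or behave differently.

```python
import argparse
import warnings


class JobNotSuccessfulError(RuntimeError):
    """Raised when the output directory lacks a _SUCCESS marker."""


def check_before_download(has_success_marker, output_dir, force=False):
    """Fail by default when the marker is missing; only warn if force is set."""
    if has_success_marker:
        return
    message = (
        f"No _SUCCESS file found in {output_dir}: "
        "the job may have failed or may still be running."
    )
    if force:
        # --force downgrades the error to a warning and lets the download proceed.
        warnings.warn(message + " Downloading anyway because --force was given.")
    else:
        raise JobNotSuccessfulError(message + " Use --force to download anyway.")


def parse_args(argv=None):
    """Parse a hypothetical command line with an output directory and --force."""
    parser = argparse.ArgumentParser(description="Download sketch (hypothetical CLI).")
    parser.add_argument("output_dir")
    parser.add_argument("--force", action="store_true",
                        help="warn instead of failing if _SUCCESS is missing")
    return parser.parse_args(argv)
```

Keeping the default strict means an incomplete download cannot happen silently, while a partially failed job can still be inspected by explicitly opting in.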

2018-06-06 9:22 GMT+02:00 Daniel Gómez-Sánchez notifications@github.com:

Assigned #88 https://github.com/robmaz/distmap/issues/88 to @robmaz https://github.com/robmaz.


magicDGS commented 6 years ago

Version 1.4.1 of ReadTools has this patch.