chanzuckerberg / czid-web

Infectious Disease Sequencing Platform
https://czid.org/
MIT License
79 stars 24 forks

[IDSEQ-2843] Fix fetch of pipeline outputs from s3 #3377

Closed tfrcarvalho closed 4 years ago

tfrcarvalho commented 4 years ago

Description

Jira IDSEQ-2842

This bug was caused by a combination of:

  1. flattening of results in S3,
  2. the number of align and coverage viz files, and
  3. lack of pagination

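This PR:

  • Adds pagination to output fetching
  • Removes redundant S3 fetches, since all files are now in the same folder.

Tests

  • Tested locally and confirmed that the reads table loads and all outputs are listed in the results folder.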

kislyuk commented 4 years ago

FYI the SDK has built-in paginators for this that remove the need to handle the pagination tokens yourself.

tfrcarvalho commented 4 years ago

FYI the SDK has built-in paginators for this that remove the need to handle the pagination tokens yourself.

Thank you @kislyuk! I looked last night and could not find any reference to paginators in the Ruby SDK, but since you commented on this again I took another look and found the inner iterator. Maybe I was just too tired. Code updated.
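
For readers following the thread, here is a minimal sketch of the failure mode and the fix, not the code from this PR. It assumes the root cause was that a single S3 list call returns at most 1,000 keys, so a flattened results folder with more files than that came back truncated. `FakeS3Client`, `Page`, `list_first_page`, `list_all_keys`, and the key names are hypothetical stand-ins so the example runs without AWS access; only the response fields (`contents`, `is_truncated`, `next_continuation_token`) and the `continuation_token` parameter mirror the real ListObjectsV2 call, and `contents` is simplified to plain key strings.

```ruby
# Hypothetical response shape, modeled on a ListObjectsV2 result.
Page = Struct.new(:contents, :is_truncated, :next_continuation_token)

# Hypothetical in-memory client: returns at most `page_size` keys per call
# and a continuation token when more keys remain.
class FakeS3Client
  def initialize(keys, page_size: 1000)
    @keys = keys
    @page_size = page_size
  end

  def list_objects_v2(bucket:, prefix:, continuation_token: nil)
    matching = @keys.select { |k| k.start_with?(prefix) }
    start = continuation_token.to_i # nil.to_i is 0, i.e. the first page
    slice = matching[start, @page_size] || []
    next_start = start + slice.size
    truncated = next_start < matching.size
    Page.new(slice, truncated, truncated ? next_start.to_s : nil)
  end
end

# Unpaginated fetch: a single call, so anything past the first page is lost.
def list_first_page(client, bucket, prefix)
  client.list_objects_v2(bucket: bucket, prefix: prefix).contents
end

# Paginated fetch: keep following the continuation token until the
# response is no longer truncated.
def list_all_keys(client, bucket, prefix)
  keys = []
  token = nil
  loop do
    page = client.list_objects_v2(bucket: bucket, prefix: prefix,
                                  continuation_token: token)
    keys.concat(page.contents)
    break unless page.is_truncated
    token = page.next_continuation_token
  end
  keys
end

# 2,500 illustrative keys in one flattened folder.
keys = (1..2500).map { |i| format("results/align_viz_%04d.json", i) }
client = FakeS3Client.new(keys)

puts list_first_page(client, "example-bucket", "results/").size
puts list_all_keys(client, "example-bucket", "results/").size
```

The explicit token loop is the hand-rolled version. With the real `aws-sdk-s3` gem, the response returned by `list_objects_v2` is pageable and can be iterated with `each`, which follows the tokens for you; that is presumably the built-in paginator and "inner iterator" discussed above.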

cdebourcy commented 4 years ago

Description

Jira IDSEQ-2842

This bug was caused by a combination of:

  1. flattening of results in S3,
  2. the number of align and coverage viz files, and
  3. lack of pagination

This PR:

  • Adds pagination to output fetching
  • Removes redundant S3 fetches, since all files are now in the same folder.

Tests

  • Tested locally and confirmed that the reads table loads and all outputs are listed in the results folder.

Wow! Nice find, who would have thought?