AlexsLemonade / refinebio

Refine.bio harmonizes petabytes of publicly available biological data into ready-to-use datasets for cancer researchers and AI/ML scientists.
https://www.refine.bio/

add s3 to sra urls to check #3355

Closed (davidsmejia closed 1 year ago)

davidsmejia commented 1 year ago

Issue Number

N/A

Purpose/Implementation Notes

We are now pulling NCBI SRA data from S3. To determine the transcriptome index length for these files, we need to read them with sra-stat. For that code path to execute, sra_file_input_path must be defined, so S3 URLs need to be included in the set of SRA URLs we check.
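A minimal sketch of the idea, not the actual refinebio code: the helper names, the host marker list, and the shape of the context dict are all assumptions made for illustration. Only sra_file_input_path comes from the description above.

```python
from urllib.parse import urlparse

# Hypothetical markers; the real list checked in refinebio may differ.
SRA_HOST_SUFFIXES = ("ncbi.nlm.nih.gov",)
SRA_SCHEMES = ("s3",)  # the new case: SRA data pulled from S3


def looks_like_sra_source(source_url: str) -> bool:
    """Return True if the download URL suggests an SRA-format file."""
    parsed = urlparse(source_url)
    if parsed.scheme.lower() in SRA_SCHEMES:
        return True
    host = parsed.hostname or ""
    return any(host.endswith(suffix) for suffix in SRA_HOST_SUFFIXES)


def build_job_context(source_url: str, local_path: str) -> dict:
    """Set sra_file_input_path only for SRA sources (hypothetical helper)."""
    context = {"input_file_path": local_path}
    if looks_like_sra_source(source_url):
        # Downstream code can read this file with sra-stat only if the key is set.
        context["sra_file_input_path"] = local_path
    return context
```

With this shape, an `s3://` URL ends up on the same path as an NCBI URL, while other sources leave sra_file_input_path undefined.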

Note: I think it would be better to inspect the file itself to determine its type before determining the index length, rather than relying on the download source.
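A rough sketch of what file inspection could look like, offered as one possible approach and not something this PR implements. The function name is hypothetical, and the `NCBI.sra` signature is an assumption about how SRA archives begin that should be verified against real files. The gzip magic bytes and the leading `@` of a FASTQ record are standard.

```python
GZIP_MAGIC = b"\x1f\x8b"   # standard gzip header bytes
SRA_MAGIC = b"NCBI.sra"    # assumption: SRA archives start with this signature
FASTQ_PREFIX = b"@"        # FASTQ records begin with an '@' header line


def sniff_file_type(path: str) -> str:
    """Guess the file type from its first bytes (hypothetical helper)."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    if head.startswith(SRA_MAGIC):
        return "sra"
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    if head.startswith(FASTQ_PREFIX):
        return "fastq"
    return "unknown"
```

A caller could then set sra_file_input_path whenever the result is `"sra"`, regardless of where the file was downloaded from.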

Methods

n/a

Types of changes

Functional tests

N/A; tested locally.

Checklist

Screenshots

N/A