AlexsLemonade / refinebio

Refine.bio harmonizes petabytes of publicly available biological data into ready-to-use datasets for cancer researchers and AI/ML scientists.
https://www.refine.bio/

Unblock refinebio processing #3478

Open davidsmejia opened 3 weeks ago

davidsmejia commented 3 weeks ago

Context

This will probably be converted to an epic, but I just want to lay out the procedural steps required to get to a point where we can determine why processing is failing.

One challenge while trying to debug this problem is that the errors don't seem to be reproducible locally. At least, I have had trouble creating tests that capture the same errors (I see different failure reasons). Additionally, logs are not retained indefinitely on staging, which makes jumping back into debugging difficult because it requires knowing which experiment triggers the error. This can be resolved with a little more organization.

For now, we should clean up staging and start from a clean slate. From there, we should break up all (presumably processable) experiments into smaller groups and run them more slowly, so that we can attempt to reproduce any failures locally.
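As a rough sketch of what "break up and run slowly" could look like, assuming we have a plain list of experiment accession codes and some way to submit a batch: `submit` below is a hypothetical callable, not an existing refinebio function, and the batch size and delay are placeholders we would need to tune.

```python
import time
from typing import Callable, Iterator, List, Sequence


def batched(accessions: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` accession codes."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(accessions), size):
        yield list(accessions[start:start + size])


def run_slowly(
    accessions: Sequence[str],
    size: int,
    delay_seconds: float,
    submit: Callable[[List[str]], None],  # hypothetical: queues one batch
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Submit batches one at a time, pausing between them.

    Returns the number of batches submitted.
    """
    count = 0
    for batch in batched(accessions, size):
        if count:
            # Pause before every batch except the first.
            sleep(delay_seconds)
        submit(batch)
        count += 1
    return count
```

Keeping the batches small means that when something fails, the list of candidate experiments to retry locally stays short.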

From the errors that are generated (we should expect errors because there always are some), we need to open issues that capture the logs so we can implement a fix if appropriate.
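To keep this organized, one option is to group failures by their failure reason so that each distinct reason maps to one issue. This is only a sketch: it assumes we can export `(accession, failure_reason)` pairs from staging, and the function name and input shape are hypothetical rather than part of the refinebio codebase.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_failures(
    failures: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Map each failure reason to the accession codes that hit it.

    `failures` is an iterable of (accession, failure_reason) pairs.
    Surrounding whitespace in the reason is ignored when grouping.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for accession, reason in failures:
        grouped[reason.strip()].append(accession)
    return dict(grouped)
```

Each key would then become the title or summary of an issue, with the listed experiments and their logs attached while the logs are still available.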

Solution or next step