broadinstitute / imaging-backup-scripts

Scripts to backup data for the Imaging Platform
MIT License

Figure out and document upper limit for restore_intelligent #19

Open bethac07 opened 2 years ago

bethac07 commented 2 years ago

From here (needs jump repo access):

Yeah, looking at CPU utilization, I think I'm right: even though AFAICT the workers call from the main API is being passed correctly into tqdm, it was not using 95 workers. Look at the bump in CPU and network out in the last third there, when I start two more tmux sessions with 7 more calls in total to the Python script. We certainly may hit Amazon's API limit eventually. The not-as-ideal part with this script is that we won't know until the end of each one how many are errors (but we can scrape them from the CSV and try again), but I'm just glad that it doesn't seem like 500-700 API requests a minute is a hard cutoff! [image]

I'm not sure if this is a tqdm hard cutoff or what; I don't think it's worth trying to increase if it's not SUPER easy to do, because it's easy to solve with parallel, but we should at least dig out the reason.
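
One possible way to start digging: a minimal sketch that measures how many workers actually run at the same time for a given requested count. It uses only the standard library's `ThreadPoolExecutor` as a stand-in, since I haven't confirmed here how the script hands its workers value to tqdm. The names `measure_peak_concurrency` and `fake_restore_call` are hypothetical, and the sleep is a placeholder for a real restore request.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def measure_peak_concurrency(requested_workers, n_tasks, task_seconds=0.05):
    """Return the highest number of tasks observed running simultaneously."""
    lock = threading.Lock()
    active = 0
    peak = 0

    def fake_restore_call(_):
        # Hypothetical stand-in for one API request made by the script.
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(task_seconds)  # placeholder for network latency
        with lock:
            active -= 1

    with ThreadPoolExecutor(max_workers=requested_workers) as pool:
        # Consume the iterator so every task finishes before we read `peak`.
        list(pool.map(fake_restore_call, range(n_tasks)))
    return peak


if __name__ == "__main__":
    requested = 95
    observed = measure_peak_concurrency(requested, n_tasks=500)
    print(f"requested={requested} observed_peak={observed}")
```

If the observed peak tracks the requested number here but not inside the real script, that would point at how the workers value is forwarded rather than at a thread pool limit.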