griffin-h opened 4 months ago
I've fixed this issue before here, where I just have the code sleep until the API lets us query again. A solution like this will ensure the completeness of SAGUARO's TNS data "copy". That said, I understand why something like this could be suboptimal for code we want to finish quickly (like the vetting code).
So, in the vetting code, I'm thinking we may want to record our position in the TNS query when the API throttles our usage, finish the vetting with just the objects we have pulled in so far, and then repeat everything until we finish querying TNS. This way the code is always running, and it still gives us a complete sample of TNS objects.
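For reference, a minimal sketch of the sleep-until-allowed pattern from that earlier fix, assuming a throttled request comes back as HTTP 429; the endpoint, function name, and fixed back-off here are just placeholders for illustration:

```python
import time

import requests

TNS_SEARCH_URL = 'https://www.wis-tns.org/api/get/search'  # placeholder endpoint

def query_tns_until_allowed(payload, headers, wait=60.0):
    """Keep retrying a TNS query, sleeping whenever the API throttles us.

    `wait` is a fixed back-off in seconds; the real code could instead read
    the reset time out of the response headers if TNS provides one.
    """
    while True:
        response = requests.post(TNS_SEARCH_URL, data=payload, headers=headers)
        if response.status_code == 429:  # throttled: wait and try again
            time.sleep(wait)
            continue
        response.raise_for_status()
        return response.json()
```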
Thoughts? We can also discuss this more Thursday.
I think this is a different situation, because we are not attempting to make a full copy of the TNS here. We already make a copy of their targets table hourly (outside the TOM). This is just to retrieve photometry for any new targets. This will never be complete anyway, because people could always add new photometry. For that reason, I don't think we need to worry about keeping track of whether a target failed.
I would suggest just trying 1 or 2 more times with a short sleep in between, and then giving up on the TNS portion if it still doesn't work. Our main concerns are (1) not letting it take so long that people get frustrated if the page doesn't reload quickly, and (2) not letting the ingest_tns script crash before vetting all the transients.
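Something like this, maybe (just a sketch; the endpoint, argument names, and retry counts are placeholders, not the actual ingest_tns code):

```python
import logging
import time

import requests

logger = logging.getLogger(__name__)

def fetch_tns_photometry(payload, headers, retries=2, wait=5.0):
    """Try the TNS photometry query a few times, then give up quietly.

    Returning None lets the caller skip the TNS portion instead of crashing,
    so the page still reloads quickly and ingest_tns keeps vetting transients.
    """
    url = 'https://www.wis-tns.org/api/get/object'  # placeholder endpoint
    for attempt in range(retries + 1):
        response = requests.post(url, data=payload, headers=headers)
        if response.ok:
            return response.json()
        logger.warning('TNS query failed (attempt %d of %d)', attempt + 1, retries + 1)
        if attempt < retries:
            time.sleep(wait)  # short pause before the next try
    return None  # give up on the TNS portion; caller should handle None
```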
Just summarizing our discussion in person about this issue and how we want to fix it:
We will add a timelimit parameter to the TNS query. It will have a default of 5 s, so the query will never wait longer than that when the webpage is being reloaded. Then, for the cron job, we can just set it to something like np.inf and it will continuously wait until we have all the data.
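Roughly what we have in mind for the timelimit behavior (a sketch only; the endpoint, names, and retry interval are placeholders): the web view uses the 5 s default, and the cron job passes np.inf (or float('inf')) so it waits as long as it needs to.

```python
import time

import requests

def query_tns(payload, headers, timelimit=5.0, wait=1.0):
    """Retry the TNS query until it succeeds or `timelimit` seconds elapse.

    With the 5 s default the webpage never waits long on a reload; the cron
    job can pass timelimit=np.inf to keep waiting until all the data arrive.
    """
    url = 'https://www.wis-tns.org/api/get/object'  # placeholder endpoint
    start = time.monotonic()
    while True:
        response = requests.post(url, data=payload, headers=headers)
        if response.ok:
            return response.json()
        if time.monotonic() - start + wait > timelimit:
            return None  # out of time: caller falls back to no TNS data
        time.sleep(wait)
```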
Right now, the vetting code completely crashes with a JSON decoding error when the TNS throttles our API usage. We should check for that case and just skip over it, so that the rest of the vetting code can continue.
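For the vetting code, the fix could be as small as wrapping the decode step, something along these lines (a sketch; the function name is made up):

```python
import logging

logger = logging.getLogger(__name__)

def parse_tns_response(response):
    """Decode a TNS response, returning None instead of crashing on throttling.

    When the TNS throttles us the body isn't valid JSON, so .json() raises;
    catching that lets the vetting loop skip this target and keep going.
    """
    try:
        return response.json()
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        logger.warning('TNS response was not valid JSON (probably throttled); skipping.')
        return None
```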