hipspy / hips

Python library to handle HiPS
https://hips.readthedocs.io
12 stars 16 forks source link

Improve parallel tile fetching #107

Open cdeil opened 7 years ago

cdeil commented 7 years ago

@adl1995 added a first working version of parallel tile fetching in #106 . This is a reminder issue with my thoughts how to improve it in one or more follow-up pull requests.

The most important improvement would be to implement error handling for timeout and missing tiles as discussed in the last telcon.

Another change I would suggest is to actually restructure the code so that the high-level fetch_tiles function creates a list of URLs from the list of HipsTileMeta, and then only passes that list of URLs to the other functions, and they return the raw data of the response or error information. At the moment the HTTP fetch is mixed in with HipsTileMeta / HipsTile for no good reason, this only happened because at the time we didn't understand how to keep track which Future corresponds to which request. There's a few ways to do it, maybe the dict solution in the ThreadPoolExecutor docs in the Python standard lib docs is a good one? (I just didn't understand at first why a dict is needed there at all). Or maybe it would be simpler to always return a dict from each request that contains the URL it was given as input, along with data or error information?

@adl1995 - Can you please implement one or both of those suggestions? My guess is that it would be easiest (and about 1 day of work, i.e. not too bad) to do both in one PR, and coding-wise to do the refactor to only pass URL to the fetch code and return dicts from each request first, and then add in the error info / handling second (you'll already have the dicts in place where you can add them).