Closed adl1995 closed 7 years ago
@adl1995 - Thanks!
Please put this on the list of things to discuss in the telcon this afternoon.
For sure, this code shouldn't be located in hips/draw/paint.py
.
As discussed since the start of the project (and started by you in #33), the tile fetching / caching should be in it's own file. We've discussed it many times, but it hasn't happened so far, we should really make it a priority for the last week.
One reason that it's difficult is that (I think) there's a fundamental decision to be made whether we are happy with two phases that happen after each other (first fetch all tiles, then do all drawing), or whether we want to support async tile fetching and immediately and async tile drawing as the tiles come it. That second option is faster, and also much more memory-efficient (can delete tiles that are drawn from memory immediately), but also leads to tighter coupling between the fetching / caching and drawing code. My suggestion for now (and I think has always been) would be that we implement the simple solution and move the tile fetching out to a separate Python file, and have it as much as possible decoupled from the drawing, i.e. the drawer just gives a list of tiles needed to the fetcher and is handed back that list of tiles, i.e. there's two lines of code in the painter calling the fetcher.
The second question / suggestion I have is whether introducing aiohttp
as a dependency is really needed, or whether an async or at least parallel via threads fetch wouldn't work for our simple application and we would avoid the extra dependency. There's e.g. this:
https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor-example
@adl1995 - I know you have looked at this before GSoC started. Please remind us of the results this afternoon and then we discuss.
Also: if we go the simple route of separating out the fetching, then we could simply have two different fetchers, one using aiohttp
as an optional dependency and the other just the Python standard lib to fetch, and the user could select via a simple option which one to use (probably defaulting to the Python standard lib, and only offering aiohttp
as opt-in for power-users that really need the extra little bit of performance.
@cdeil I have added asynchronous tile fetching using both urllib
and aiohttp
, however, the test fails at:
https://travis-ci.org/hipspy/hips/jobs/265656774
I think this is due to the reason that tiles are not fetched sequentially and when they are matched with HipsTileMeta
, they are misaligned. https://github.com/hipspy/hips/pull/106/files#diff-27bdb9fa08176e9f36aa86205b78af6eR71
What do you think is the best solution to this? Should I somehow label each tile, so I can sort them once they are fetched?
This PR has changes to astropy_helpers
which shouldn't be here. Please fix.
Concerning your question about the current bug that tilemeta / tile isn't matched correctly: I don't know.
Please add a unit test that fails and address the other points mentinoed above, and then I'll have a look.
If you have an idea how to fix the issue, just go for it and then I'll review.
@cdeil I've made the updates, however, I still don't know how to enforce order when fetching tiles, so the tests fail.
Also, can you please explain how I could remove changes from astropy_helpers
?
Also, can you please explain how I could remove changes from astropy_helpers?
Here's one way to do it:
cd astropy_helpers
git checkout 14ca346 # This is the version hips master points to
cd ..
git status
git add -u
git commit -m 'Undo change to astropy_helpers'
Then to avoid introducing a messy git history where the astropy_helpers
version jumps back and forward again, I'd suggest to use
git fetch origin # might be called upstream for you, I don't know
git rebase -i origin/master
and to declare this new commit a fix-up commit on top of 05a130 which is the commit where you introduced the accidental change:
pick 6c12958 Introduce asynchronous fetching of HiPS tiles
pick 0eba105 Add option to fetch tiles asynchronously using urllib
pick 05a130f Add tqdm progress bar to urllib and aiohttp fetch methods
fixup 295434b Undo change to astropy_helpers
Finally, git push -f
the branch to Github and check the list of commits and diff there. Should be all good now.
You're probably aware, but just in case: get into the habit of always using git status
and git diff
to review changes before committing, to spare yourself (and me) the troubles of having to fix up your git history later.
I would suggest that the HipsTileFetcher
is given a Python list of HipsTileMeta
of tiles to fetch.
That is simpler and more flexible compared to what you have now.
It then also means that you can create the list of HipsTile
that you return more easily, because you already have the meta
.
For the async code, can't you have a system that makes it easy to attach the right data to the right meta? Maybe one option could be that you pass the meta
for each tile to the async function and that function hands back the complete tile
, no?
@cdeil Your suggestion on coupling HipsTileMeta
with the fetch method worked, thanks!
I think this is ready to be merged now. Handling the fetch errors and missing tiles could be a separate PR.
One more thing: I think it would be good to have at least two or three sentences of description in the fetcher saying what it does and at least one example how it works. Note that this is code-wise by far the most complex piece in our package (using threads and async and requests) and soon we will have to extend it with error handling in some way and I think it would be really valuable to have a short high-level description what it does and how it works for future developers.
Users will probably never use it directly. For them the options in make_sky_image
matter. There you currently only expose the choice whether to use urllib or aiohttp, but not n_parallel or timeout. I think it would be especially important to expose timeout
, because from experience I know that it can be super frustrating to have something time out if one a slow but working network, with no option to increase the timeout parameter. The options to expose it would be either individually, or as a dict "fetch_opts" where the user can pass e.g. fetch_opts=dict(package='urllib', timeout=30, n_parallel=2)
@cdeil I just made the updates. I have removed the HipsTileFetcher
class and now the fetch.py
file exports a fetch_tiles
function. I'm not sure if this is neat, as now I had to pass around arguments that were previously accessed using self
.
@cdeil I have refactored fetching tiles using urllib
, now tiles_urllib
function computes url
on the fly and passes it to executor.submit
. This works, but I have a feeling that tiles are not being fetched asynchronously, I say this because it took me 89 seconds for fetching tiles when building the docs.
@cdeil The example should be fixed now, there was a leading ,
after the url
, so it caused errors. The tile fetching using urllib
is now being done asynchronously.
@adl1995 - Thanks!
We could merge now if you like, but I have one remaining question / concern regarding how the async and thread parallel fetch is implemented
if progress_bar
up front, but in the async done differently. Is there a way to make the thread and async implementation more similar or is that not possible because with async / await it's not possible to do it equivalently?.result()
call, i.e. if a given tile is slow to fetch , but later ones are fast.@adl1995 - If you're interested in fully understanding threads and async and what your code is doing (or maybe even writing another blog post), you could spend some time and use e.g. https://github.com/astrofrog/psrecord to "record' and plot or print a log of what is going on. I don't know if that can monitor number of threads and network I/O, but probably https://pythonhosted.org/psutil/ or some other tools can, and you could set of a "benchmark", i.e. a small Python file that executes a few fetches and / or draws and then makes a printout or plot summarising what is going on. This is something I wanted to do before implementing the faster tile drawing, but I won't have time for it. So if you're interested, take a day or two max for this "side project" and do that.
@cdeil There is an option to move the progress bar in the beginning in case of async, but it makes the code synchronous. The modified function is this:
async def fetch_all_tiles_aiohttp(tile_metas: List[HipsTileMeta],
hips_survey : HipsSurveyProperties, progress_bar: bool) -> List[HipsTile]:
"""Generator function to fetch HiPS tiles from a remote URL using aiohttp."""
import aiohttp
async with aiohttp.ClientSession() as session:
if progress_bar:
from tqdm import tqdm
tile_metas = tqdm(tile_metas, total=len(tile_metas), desc='Fetching tiles')
tiles = []
for meta in tile_metas:
url = hips_survey.tile_url(meta)
task = asyncio.ensure_future(fetch_tile_aiohttp(url, meta, session))
tiles.append(await task)
return tiles
@cdeil I made the updates as discussed in the call. However, there is a small problem. When using asyncio.as_completed
the tiles are not fetched in sequence, this is no issue during drawing, but in test_fetch.py
I assert on the first tile, which can be at a different index each time.
So, for now, I just fetched a single tile and asserted on that. I found this question on SO, but did not try to implement this, because this would create an extra overhead which isn't exactly needed, except for the test_fetch.py
file.
@adl1995 - I've done one commit 1fd6e7f to (hopefully) improve the code a little bit.
It's minor changes, mostly to avoid very long lines.
One change is that I managed to get rid of the no-op "progress_bar", by simply wrapping an iterable of futures with the tqdm
call, again returning an iterable.
I think / hope it's still working as before.
One thing I noticed is that for the async, you haven't actually implemented the n_parallel
and timeout
options. Can you please add it?
Both are really important for the caller to have control over.
@cdeil Can you please tell how I could pull in your commit? I messed up my Git history.
The branch on Github is OK and contains my commit: https://github.com/hipspy/hips/pull/106/commits
You could just delete your local branch, and check out a new local branch from this latest remove version, and then continue coding there, no?
@cdeil I have added timeout
and n_parallel
option to tiles_aiohttp
function. I'm not sure why you removed asyncio.as_completed
, doesn't this mean that if one tile takes too long to fetch, others have to wait?
@cdeil The tests fail with:
ImportError: /home/travis/miniconda/envs/test/lib/python3.6/lib-dynload/../../libz.so.1: version `ZLIB_1.2.9' not found (required by /home/travis/miniconda/envs/test/lib/python3.6/site-packages/matplotlib/../../.././libpng16.so.16)
I'm not sure what causes this.
I have added timeout and n_parallel option to tiles_aiohttp function.
Thanks!
I'm not sure why you removed asyncio.as_completed, doesn't this mean that if one tile takes too long to fetch, others have to wait?
I don't think I did remove it, just moved it up a few lines, no?
The tests fail ...
This looks like a Linux install issue of our dependencies, not like something related to this PR. I've restarted travis-ci to see if it was a fluke.
travis-ci passed, it was a fluke.
I've done a final review and some small improvements in b4805ad :
aiohttp.TcpConnector
because I think it's a bit simpler then the asyncio.Semaphore
(one less level of code indentation, and more powerful (has other options to control requests that we might want to use in the future).Merging this now, and then I'll make a issue with my thoughts how to improve fetch.py
further for you @adl1995 in a follow-up PR.
As the title states, HiPS tiles are now fetched asynchronously. I noted 4 seconds for tile fetching which previously took around 25 seconds at my end. For now, this is introduced in the
HipsPainter
class, but let me know if this needs to be refactored somewhere else.