gjoseph92 / stackstac

Turn a STAC catalog into a dask-based xarray
https://stackstac.readthedocs.io
MIT License
245 stars, 49 forks

500 Internal Server Error when requesting tiles from within docker container #103

Closed robintw closed 2 years ago

robintw commented 2 years ago

This is a separate issue to discuss the other problem I was running into last week (partially discussed in #96 and #97). I am running stackstac.show from within a JupyterLab notebook running in a docker container.

When running show, I can see the checkerboard pattern, but not the data itself. Looking at developer tools, I can see that the tile requests for the data itself are returning a 500 error:

image

Unfortunately, the content of the response is just:

500 Internal Server Error
Server got itself in trouble

Do you have any suggestions as to how we could debug this further, and get more information out of the server?

gjoseph92 commented 2 years ago

@robintw maybe try setting the JupyterLab log level to INFO or DEBUG in the log console in the UI? https://github.com/gjoseph92/stackstac/blob/d3a78c490860c85319a91bdf3712e58986593def/stackstac/show.py#L849-L851 Sometimes this gets error messages to show up for me. If this doesn't work for you, I'll add something to the code. I've found it very annoyingly hard to get logs out.

robintw commented 2 years ago

I tried setting the log level, and it didn't get me any useful information.

Interestingly, I now don't seem to be getting 500 errors. I'm not sure what changed, but I'm getting 200 responses instead, although the tiles seem to be blank. I think I've worked out what's going on though. Looking in developer tools, the URL that it's trying to get for a tile is:

http://127.0.0.1:5000/lab/tree/local-machine/repo/proxy/8000/39483fd3f0035a039ac0a1a15fe6cad7/7/74/105.png

This seems to be putting the /proxy/blah bit on the end of the path to the file that's open in JupyterLab. My current file is in the folder local-machine/repo/. If I try and go to that URL in a browser I get an error from Jupyter telling me that file doesn't exist. If I edit the URL to be this:

http://127.0.0.1:5000/proxy/8000/39483fd3f0035a039ac0a1a15fe6cad7/7/74/105.png

(by removing lab/tree/local-machine/repo/)

and view that in my browser, then I see a correct tile, with the right data showing.

So, it looks like there is some sort of bug in the code that generates the tile URLs. I'll have a look through the code later to see if I can spot where the problem is, but I thought I'd post this here now in case the fix was really simple for someone with more knowledge of the code base.

gjoseph92 commented 2 years ago

@robintw the code you're probably looking for is https://github.com/gjoseph92/stackstac/blob/d3a78c490860c85319a91bdf3712e58986593def/stackstac/show.py#L628-L634

When you're on JupyterLab, what's the URL in your browser?

robintw commented 2 years ago

Ok, I really don't understand this - it's started working fine now! I've run it multiple times, and restarted the docker container, and it all 'just works'.

For reference, the URL in my browser was http://127.0.0.1:5000/lab/tree/local-machine/lineaments/StackStac_Show_Example.ipynb, which works fine with that regex - it extracts the correct base part.

I've no idea what was going on, but I'll close this for now and re-open in future if it re-occurs.

gjoseph92 commented 2 years ago

The incorrect URLs make me think of something else. The way we figure out the base JupyterLab URL is extremely hacky. AFAICT there's no way from the Python side to get the current URL for Jupyter, as much as I've dug through Jupyter internals. So instead we use ipyleaflet's window_url attribute, which is pulled by JavaScript from window.location and sent over the network to the Python side (that's the only way I could figure out to get it!). So basically, the map has to be fully rendered in JavaScript before we can properly set up the tile URLs for the map layers. If this URL changes, we should theoretically update things properly: https://github.com/gjoseph92/stackstac/blob/d3a78c490860c85319a91bdf3712e58986593def/stackstac/show.py#L579-L589

But if you see these issues again, I'd be curious to look at

m = stackstac.show(...)
m  # display it

m.window_url  # what's this?

robintw commented 2 years ago

I've run into this problem again now. I've no idea why it seemed to go away, but it is definitely back now.

The tiles that are being requested (according to the Developer Tools) have the URL:

http://127.0.0.1:5000/lab/tree/local-machine/repo/proxy/8000/6ffb8cc48f234eac51eb7fd4f86b373c/6/37/52.png

If I edit the URL to remove the lab/tree/local-machine/repo/ bit, then the tile displays in the web browser fine.

The output of m.window_url is:

'http://127.0.0.1:5000/lab/tree/local-machine/repo/notebooks/lineaments/Fractal%20Dimension%20v5%20-%20Mag%20Data%20-%20One%20Threshold%20-%20PC.ipynb'

which matches the URL shown in the web browser.

I really can't understand why this is happening, when I've tested the regex you're using and it works fine.

Any ideas?

robintw commented 2 years ago

I've worked out what the problem is here. Sometimes I was using a notebook whose path had notebooks in it. Because the quantifier in the regular expression is greedy, it matched all the way up to that later instance of notebook rather than stopping at the notebook/lab/voila bit nearer the start of the URL. Adding a lazy quantifier to the regex fixes this. I'll double-check it all and submit a PR later today.
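
For anyone hitting the same thing, here is a minimal sketch of the greedy versus lazy behaviour. The two patterns and the base_url helper are assumptions written for illustration; they are not copied from stackstac's show.py, and the real regex may differ. The window URL is the one reported earlier in this thread.

```python
import re

# Hypothetical patterns for illustration only (not taken from stackstac).
# GREEDY: ".+" grabs as much as possible, so the match ends at the LAST
# "/lab", "/notebook" or "/voila" found in the URL.
GREEDY = re.compile(r"^(.+)/(?:lab|notebook|voila)")
# LAZY: ".+?" grabs as little as possible, so the match ends at the FIRST one.
LAZY = re.compile(r"^(.+?)/(?:lab|notebook|voila)")


def base_url(pattern, window_url):
    """Return the part of window_url captured before the Jupyter app segment."""
    match = pattern.match(window_url)
    if match is None:
        raise ValueError(f"No lab/notebook/voila segment found in {window_url!r}")
    return match.group(1)


# The value of m.window_url reported above; note the "notebooks" folder.
window_url = (
    "http://127.0.0.1:5000/lab/tree/local-machine/repo/notebooks/lineaments/"
    "Fractal%20Dimension%20v5%20-%20Mag%20Data%20-%20One%20Threshold%20-%20PC.ipynb"
)

greedy_base = base_url(GREEDY, window_url)
lazy_base = base_url(LAZY, window_url)

print(greedy_base)  # base that includes lab/tree/local-machine/repo
print(lazy_base)    # base that stops at the server root
```

With the greedy pattern, the captured base ends at local-machine/repo, which is exactly the prefix that showed up in front of /proxy/8000/ in the broken tile URLs. The lazy pattern stops at the first /lab and gives the server root.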

gjoseph92 commented 2 years ago

🤦 thanks for digging into it @robintw! That certainly seems like the problem. A PR would be great.