jupyter / nbviewer

nbconvert as a web service: Render Jupyter Notebooks as static web pages
https://nbviewer.jupyter.org

404 on notebook from GitHub releases assets #875

Open prihoda opened 4 years ago

prihoda commented 4 years ago

Hi, I'm getting a 404 on: https://nbviewer.jupyter.org/github/Merck/deepbgc/releases/download/v0.1.0/DeepBGC_Example_Result.ipynb

The URL is accessible: https://github.com/Merck/deepbgc/releases/download/v0.1.0/DeepBGC_Example_Result.ipynb

When I fetch the URL with wget, I see that it redirects to a specific AWS URL. That URL works, but I don't think it is permanent.

To reproduce:

  1. Go to https://nbviewer.jupyter.org/
  2. Set URL to https://github.com/Merck/deepbgc/releases/download/v0.1.0/DeepBGC_Example_Result.ipynb
  3. Click Go!
  4. See error

Expected behavior: nbviewer follows the redirect and uses the redirected URL to fetch the notebook.

krinsman commented 4 years ago

Here's what I think the issue is: the GitHub API looks at this URL and treats it as invalid, because it only expects URLs of the form https://github.com/<github_user>/<repo_name>/blob/<branch>/<path_to_file> or https://github.com/<github_user>/<repo_name>/tree/<branch>/<path_to_directory>.

The path above doesn't fit into that neat framework, so the API thinks it's invalid (at least that is my guess).
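To make the mismatch concrete, here is a minimal sketch that checks a URL against the two forms listed above. The pattern `BLOB_OR_TREE` and the helper `fits_expected_form` are hypothetical names written for this illustration; they are not nbviewer's actual regex or code.

```python
import re

# Hypothetical pattern for the two URL shapes named above
# (.../blob/<branch>/<path> and .../tree/<branch>/<path>).
# This is an illustration, not the regex nbviewer uses.
BLOB_OR_TREE = re.compile(
    r"^https://github\.com/"
    r"(?P<user>[^/]+)/(?P<repo>[^/]+)/"
    r"(?P<kind>blob|tree)/(?P<ref>[^/]+)/(?P<path>.+)$"
)


def fits_expected_form(url: str) -> bool:
    """Return True if the URL looks like a blob/tree URL."""
    return BLOB_OR_TREE.match(url) is not None


# The release asset URL from this issue.
release_url = (
    "https://github.com/Merck/deepbgc/releases/download/"
    "v0.1.0/DeepBGC_Example_Result.ipynb"
)
# A blob URL taken from the links in this thread.
blob_url = (
    "https://github.com/jupyter/nbviewer/blob/master/"
    "nbviewer/providers/github/handlers.py"
)

print(fits_expected_form(blob_url))
print(fits_expected_form(release_url))
```

The release URL has `releases/download` where the pattern expects `blob` or `tree`, so it falls outside both forms.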

That wouldn't be much of an issue if the URL provider could fetch the notebooks that the GitHub API can't handle. However, the GitHubRedirectHandler captures every GitHub URL and sends it to the GitHub provider, even though the GitHub provider doesn't work with all GitHub URLs.

https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/github/handlers.py#L415
https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/github/handlers.py#L101
https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/github/handlers.py#L399

So I guess a fix could involve making the regex for the GitHubRedirectHandler less greedy?

There would be a lot of ways to implement it. I suppose the question is which choice wouldn't make the code much more complicated or difficult to maintain.
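One possible shape for such a fix, as a rough sketch only: match GitHub URLs narrowly (blob and tree paths) and let everything else fall through to the generic URL provider. The names `GITHUB_PATH` and `choose_route` are hypothetical, and the `/github/...` and `/urls/...` route prefixes are assumptions modeled on how nbviewer URLs look; none of this is taken from the handlers linked above.

```python
import re
from urllib.parse import urlsplit

# Hypothetical narrower pattern: only blob/tree paths are treated as
# something the GitHub provider can handle.
GITHUB_PATH = re.compile(
    r"^/(?P<user>[^/]+)/(?P<repo>[^/]+)/(?P<kind>blob|tree)/(?P<rest>.+)$"
)


def choose_route(url: str) -> str:
    """Pick an nbviewer-style route for a notebook URL (illustrative only)."""
    parts = urlsplit(url)
    if parts.netloc == "github.com":
        match = GITHUB_PATH.match(parts.path)
        if match:
            # Blob/tree URL: hand it to the GitHub provider.
            return "/github/{user}/{repo}/{kind}/{rest}".format(**match.groupdict())
    # Anything else, including release assets, falls back to the generic
    # URL provider (assumed prefixes: "urls" for https, "url" for http).
    prefix = "urls" if parts.scheme == "https" else "url"
    return f"/{prefix}/{parts.netloc}{parts.path}"


release_url = (
    "https://github.com/Merck/deepbgc/releases/download/"
    "v0.1.0/DeepBGC_Example_Result.ipynb"
)
print(choose_route(release_url))
```

With this approach the release asset from the original report would be fetched as a plain URL, so the redirect to the storage host would be handled by whatever HTTP client the URL provider uses. Whether that is the least intrusive change for the real handlers is an open question.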

krinsman commented 4 years ago

I could have sworn there was a similar issue (i.e. caused by the same problem) but I can't find it right now.