nilsnolde / wordpress-markdown-git

:loop: WordPress plugin to add file content (Markdown, Jupyter notebooks) from a Git based VCS to a WordPress post; replaces https://github.com/gis-ops/md-github-wordpress
GNU General Public License v3.0

404 - Not Found for valid Jupyter notebook URLs #17

Closed gkamradt closed 4 years ago

gkamradt commented 4 years ago

Versions: WordPress 5.4.2, Documents for Git 1.0.2

Describe the bug: Given a valid URL, the shortcode block renders "404-not found" once the page has loaded. The rest of the post loads fine. I'm happy to go down a debug path myself, but I'm new and not sure where to start!

Paste the shortcode and the censored contents of config.json:

[git-github-jupyter url="https://github.com/Data-Indepedent/pandas-everything/blob/master/pandas_functions/Pandas_Duplicated.ipynb"]

config.json
{
  "limit": 5,
  "classes": "",
  "Github": {
    "user": "gkamradt",
    "token": "*****************************"
  },
  "Bitbucket": {
    "user": "",
    "token": ""
  },
  "Gitlab": {
    "user": "",
    "token": ""
  }
}

To Reproduce Steps to reproduce the behavior:

  1. Push a file to GitHub
  2. Grab the URL of the file you pushed
  3. Insert the URL into the shortcode within a WordPress post
  4. View the post or draft and notice the "404-not found"

Expected behavior: The Jupyter notebook should render normally. Weirdly, the exact same shortcode (with a different URL) loads fine. This is the shortcode that loads perfectly: [git-github-jupyter url="https://github.com/Data-Indepedent/pandas-everything/blob/master/pandas_functions/Pandas%20Pop%20%7C%20pd.DataFrame.pop().ipynb"]

Additional context: The files were just loaded. I thought this was a timing thing, but it's been 8+ hrs and it's still a 404.

nilsnolde commented 4 years ago

True, that's weird. If you don't want to get involved in WP/PHP dev I'd recommend not going down the road of debugging yourself;) I'll have a look tmrw.

gkamradt commented 4 years ago

@nilsnolde Sounds good, thank you. I'd love to hear how you're thinking about it, so I can learn for later.

nilsnolde commented 4 years ago

So, the plugin calls nbviewer.jupyter.org and extracts the generated HTML, which it feeds into your post with some CSS applied. What is failing here is the rendering of your notebook(s) on nbviewer.

Check https://nbviewer.jupyter.org/ and paste your file(s). You'll notice that the only one working from your whole collection is the one you already successfully tried.
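To make that check repeatable, here is a minimal Python sketch of the URL mapping. It is not the plugin's code (the plugin is PHP); the mapping is only inferred from the GitHub and nbviewer URLs quoted in this thread, and `github_to_nbviewer` and `NBVIEWER_BASE` are hypothetical names.

```python
from urllib.parse import urlsplit

# Assumption: nbviewer serves GitHub-hosted notebooks under this prefix,
# as seen in the nbviewer links posted in this thread.
NBVIEWER_BASE = "https://nbviewer.jupyter.org/github"


def github_to_nbviewer(url: str) -> str:
    """Map a github.com blob URL to the corresponding nbviewer URL."""
    parts = urlsplit(url)
    if parts.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {url}")
    # The path (/<owner>/<repo>/blob/<branch>/<file>) is reused unchanged,
    # including any percent-encoding already present in the file name.
    return NBVIEWER_BASE + parts.path


if __name__ == "__main__":
    print(github_to_nbviewer(
        "https://github.com/Data-Indepedent/pandas-everything/blob/master/"
        "pandas_functions/Pandas_Duplicated.ipynb"
    ))
```

Opening the printed URL in a browser shows whether nbviewer itself can render the notebook, independently of WordPress.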

It's strange, since those files don't differ significantly. In general, I would really recommend not having file names with any whitespace; it causes trouble in most environments. Maybe try removing the whitespace and see if nbviewer picks the files up.

Feel free to report back if that worked, would be curious too. Closing for now, as it's not an issue with the plugin.

gkamradt commented 4 years ago

@nilsnolde Thanks for following up here. That is weird. nbviewer shows the folder (with all the files) fine, but returns 404 for some of them. https://nbviewer.jupyter.org/github/Data-Indepedent/pandas-everything/tree/master/pandas_functions/

I even have another file in there w/o whitespace in the name, and it does not load either: https://nbviewer.jupyter.org/github/Data-Indepedent/pandas-everything/blob/master/pandas_functions/Pandas_Duplicated.ipynb

Oh well, at least I have a new path to go down. Thank you!

nilsnolde commented 4 years ago

I'd put an issue on the nbviewer repository. Seems like a weird bug. Best of luck!

gkamradt commented 4 years ago

I went over there and saw plenty of people with the same problem. Most of them were trying to set a cache parameter (flush_cache=true/false) to get a refresh of their code. It didn't work for me.
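For anyone who wants to try the same thing, here is a small Python sketch that appends the `flush_cache` query parameter mentioned above to an nbviewer URL. The helper name `with_flush_cache` is hypothetical, and as noted, the parameter did not help in my case.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_flush_cache(nbviewer_url: str, flush: bool = True) -> str:
    """Return the URL with flush_cache=true/false set in its query string."""
    parts = urlsplit(nbviewer_url)
    # Drop any existing flush_cache value so the parameter appears only once.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "flush_cache"
    ]
    query.append(("flush_cache", "true" if flush else "false"))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(with_flush_cache(
        "https://nbviewer.jupyter.org/github/Data-Indepedent/pandas-everything/"
        "blob/master/pandas_functions/Pandas_Duplicated.ipynb"
    ))
```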

In addition to removing all of the whitespace from my file names, I removed a dash ("-") from one of my folder names. I'm not sure this is what solved it, but my files rendered after that. I was also able to view them on my site.

I set a reminder to myself to post back on here if it doesn't work later, but for now it does. Thanks for the tips.

nilsnolde commented 4 years ago

Ah great! Hope the dash wasn't the problem, > 80% of my repositories have a dash :sweat_smile: Thanks for the heads-up!

gkamradt commented 4 years ago

@nilsnolde I tracked down the problem, and many people are having the same issue. It has to do with nbviewer's cache, and no fix is in sight.

My files only refreshed when I switched the "-" in a folder name to a "_", because nbviewer had to pull the file tree once more. I could have changed it to anything and it would have worked.

nilsnolde commented 4 years ago

Haha jeez.. Congrats on tracking it down. Would've caused sleepless nights for me too;) But any clue why they didn't render in the first place? Caching doesn't sound like a culprit for a 404 to me. Or does it somehow cache the full path tree when referencing a file, and the failing notebooks were added way later?

gkamradt commented 4 years ago

Yeah, the very first time I loaded the notebook to nbviewer, it already had a single .ipynb file (the one that worked later) in the repo. So it loaded fine.

As I added new files to the repo, those were the ones that weren't loading correctly.

Then I basically reloaded the entire repo (by changing the hyphen in its name to an underscore) and then the new files started loading.

But then again, additional files that I've added since then don't work.

Yes, I believe your hypothesis is right: the file tree is cached, and it doesn't pick up the new notebooks. It's wild because I can navigate the folders via nbviewer and see all the new notebooks, but clicking into one of those new notebooks to view it gives me a 404.
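A rough Python sketch for spotting which notebooks are affected: it requests each nbviewer URL and collects the ones that answer with a 404. The helper names `http_status` and `find_missing` are hypothetical, and it assumes nbviewer returns a real HTTP 404 status for the failing pages, which I have only observed in the browser.

```python
from typing import Callable, Iterable, List
from urllib.error import HTTPError
from urllib.request import urlopen


def http_status(url: str) -> int:
    """Return the HTTP status code of a GET request to url."""
    try:
        with urlopen(url, timeout=30) as response:
            return response.status
    except HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is on the error.
        return err.code


def find_missing(
    urls: Iterable[str],
    get_status: Callable[[str], int] = http_status,
) -> List[str]:
    """Return the URLs whose status is 404, in the order they were given."""
    return [url for url in urls if get_status(url) == 404]
```

Passing a different `get_status` callable makes it possible to try the logic without hitting nbviewer at all.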

nilsnolde commented 4 years ago

That IS wild! Thanks for sharing. Seems almost inevitable to hit that bug at some point. Thanks to you I’m prepared for it now!

gkamradt commented 4 years ago

Any ideas on a workaround?

It's super frustrating having it work off/on with no control. Does anyone else who uses your plugin have the same problem?

I'd pay money per notebook if it rendered correctly 100% of the time. Not much, but >$0

nilsnolde commented 4 years ago

You would need to find an alternative (free) service which renders notebooks in some way, so you can replace nbviewer. I can assist a little with the code implementation once you're there (to switch from nbviewer). However, any functionality not belonging to our own core needs (which are purely limited to GitLab/GitHub Markdown) will need to be tended to by the community, or be put in line for us to take care of when we find time.