jupyter / nbviewer

nbconvert as a web service: Render Jupyter Notebooks as static web pages
https://nbviewer.jupyter.org

nbviewer does not flush cache #914

Open shinokada opened 4 years ago

shinokada commented 4 years ago

Following the FAQ, I added ?flush_cache=true to the end of my URL. I tried it and waited for more than 10 minutes, among other things, but nothing worked. nbviewer still shows an outdated version of my notebook.

Any idea why this does not work?
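For reference, the FAQ tip amounts to adding a `flush_cache=true` query parameter to the nbviewer URL. A minimal sketch of building such a URL without clobbering any query parameters already present; the helper name `with_flush_cache` and the example URL are made up for illustration:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_flush_cache(url: str, value: str = "true") -> str:
    """Return `url` with flush_cache=<value> set, keeping other query parameters."""
    parts = urlsplit(url)
    # Drop any existing flush_cache parameter so it is not duplicated.
    query = [
        (key, val)
        for key, val in parse_qsl(parts.query, keep_blank_values=True)
        if key != "flush_cache"
    ]
    query.append(("flush_cache", value))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL; substitute your own user, repo and notebook path.
print(with_flush_cache("https://nbviewer.jupyter.org/github/USER/REPO/blob/master/notebook.ipynb"))
```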

bryevdv commented 4 years ago

I seem to be having a similar issue with Bokeh notebooks. I updated the saved notebooks for a new Bokeh release and also made other small changes. I have confirmed the changes are present in the raw files on GitHub. But no amount of ?flush_cache=true makes the new content show up, despite the page stating that it was re-rendered "a few seconds ago". I have also cleared the cache, force reloaded, tried private tabs, etc.

Here is a link to a notebook I updated several days ago:

https://nbviewer.jupyter.org/github/bokeh/bokeh-notebooks/blob/master/quickstart/quickstart.ipynb

Here is a link to the GH source which shows those changes:

https://github.com/bokeh/bokeh-notebooks/blob/master/quickstart/quickstart.ipynb

bryevdv commented 4 years ago

@minrk do you have any thoughts? Is there something we can try or check on our end?

ndricca commented 4 years ago

Hi, I am experiencing the same issue with both the ipynb and the html version of the same notebook. Nbviewer seems to keep a cached version corresponding to yesterday's version.

betatim commented 4 years ago

[Screenshot 2020-04-01 at 23 14 55]

This is what I see in the footer at 23:15, 1 April 2020 Zurich time. It seems like nbviewer thinks it re-rendered the notebook recently, but the content is old. Not sure where this points us, but maybe it helps someone.

betatim commented 4 years ago

If I grep for bokeh in the nbviewer server logs while refreshing the quickstart notebook (with cache disabled), I see no log lines. This means (I think) that the response is being served from the CDN.
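One way to test the CDN hypothesis from the client side is to look at cache-related response headers. The sketch below only inspects headers that CDNs commonly add (`Age`, `X-Cache`, `CF-Cache-Status`, `Via`); which of them nbviewer's deployment actually emits is an assumption, and `cache_hints` and `looks_cached` are hypothetical helpers, not part of nbviewer.

```python
from urllib.request import urlopen

# Headers that CDNs commonly add; which ones a given deployment emits is an assumption.
CACHE_HEADERS = ("age", "x-cache", "cf-cache-status", "via", "cache-control", "etag")


def cache_hints(headers: dict) -> dict:
    """Pick out cache-related headers (case-insensitively) from a header mapping."""
    lowered = {key.lower(): value for key, value in headers.items()}
    return {name: lowered[name] for name in CACHE_HEADERS if name in lowered}


def looks_cached(headers: dict) -> bool:
    """Heuristic: True if the headers suggest an intermediate cache answered."""
    hints = cache_hints(headers)
    if "HIT" in hints.get("x-cache", "").upper():
        return True
    if hints.get("cf-cache-status", "").upper() == "HIT":
        return True
    age = hints.get("age", "0").strip()
    return age.isdigit() and int(age) > 0


def fetch_headers(url: str) -> dict:
    """Fetch `url` and return its response headers (performs a network request)."""
    with urlopen(url) as response:
        return dict(response.headers.items())


# Example (network access required):
# print(cache_hints(fetch_headers("https://nbviewer.jupyter.org/github/bokeh/bokeh-notebooks/blob/master/quickstart/quickstart.ipynb")))
```

A positive `Age` or a `HIT` marker would be consistent with the page never reaching the nbviewer server, which matches the empty server logs.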

There is an exception printed in the logs. Seems to happen "every other" request, but not always.

[E 200401 21:26:11 base_events:1615] Exception in callback _HandlerDelegate.execute.<locals>.<lambda>(<Task finishe...visibility'")>) at /usr/local/lib/python3.7/site-packages/tornado/web.py:2333
    handle: <Handle _HandlerDelegate.execute.<locals>.<lambda>(<Task finishe...visibility'")>) at /usr/local/lib/python3.7/site-packages/tornado/web.py:2333>
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/newrelic/common/async_proxy.py", line 91, in send
        return self.__wrapped__.send(value)
    StopIteration

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/asyncio/events.py", line 88, in _run
        self._context.run(self._callback, *self._args)
      File "/usr/local/lib/python3.7/site-packages/tornado/web.py", line 2333, in <lambda>
        fut.add_done_callback(lambda f: f.result())
      File "/usr/local/lib/python3.7/site-packages/newrelic/common/async_proxy.py", line 91, in send
        return self.__wrapped__.send(value)
      File "/usr/local/lib/python3.7/site-packages/newrelic/hooks/framework_tornado.py", line 419, in __exit__
        trace_cache().record_event_loop_wait(start_time, time.time())
      File "/usr/local/lib/python3.7/site-packages/newrelic/core/trace_cache.py", line 255, in record_event_loop_wait
        settings = transaction.settings.event_loop_visibility
    AttributeError: 'NoneType' object has no attribute 'event_loop_visibility'

I have no idea really what it is trying to tell us or how it fits into the nbviewer architecture.
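Reading the last frame of the traceback: the expression `transaction.settings.event_loop_visibility` fails with `'NoneType' object has no attribute 'event_loop_visibility'`, which means `transaction.settings` evaluated to `None` when the callback ran. The failing frames are all inside the New Relic instrumentation rather than nbviewer's own code, so the exception may well be unrelated to the stale content. A stand-alone reproduction of the same failure pattern, using a stand-in object rather than the real New Relic classes:

```python
from types import SimpleNamespace

# Stand-in for a transaction whose `settings` attribute is None (hypothetical object,
# not the real New Relic transaction class).
transaction = SimpleNamespace(settings=None)


def event_loop_visibility(txn):
    # Same attribute chain as the last frame of the traceback.
    return txn.settings.event_loop_visibility


try:
    event_loop_visibility(transaction)
except AttributeError as exc:
    message = str(exc)
    print(message)
```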

krinsman commented 4 years ago

Have you all tried setting flush_cache=false?

I've mentioned this in another thread, but I think the Boolean might be mixed up for the flush_cache query argument.

See here: https://github.com/jupyter/nbviewer/issues/295#issuecomment-591660582

bryevdv commented 4 years ago

@krinsman that has no effect for me either. Page updates to say "Rendered a few seconds ago" but the actual rendered notebook content does not update.

krinsman commented 4 years ago

Is this a notebook rendered on GitHub?

It might be an issue with the GitHub API's cache then (assuming the GitHub API caches things) -- cf. https://github.com/jupyter/nbviewer/issues/912

bryevdv commented 4 years ago

Is this a notebook rendered on GitHub?

I don't understand the question. GitHub tries to render every notebook uploaded anywhere. In any case, I linked the notebook on GH above, and you can see that the version GH renders shows the expected changes.

krinsman commented 4 years ago

GitHub website isn't the same as the GitHub API. Sometimes the API does not immediately pick up changes to the website, to the best of my knowledge.
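This hypothesis can be checked directly by comparing what the GitHub contents API returns with what raw.githubusercontent.com serves. A rough sketch follows. The URL shapes follow GitHub's REST API, whose contents endpoint returns the file base64-encoded (and may leave it empty for large files); the helper names are made up here.

```python
import base64
import hashlib
import json
from urllib.request import urlopen


def api_url(owner: str, repo: str, path: str, ref: str = "master") -> str:
    """URL of the GitHub contents API for one file at a given ref."""
    return f"https://api.github.com/repos/{owner}/{repo}/contents/{path}?ref={ref}"


def raw_url(owner: str, repo: str, path: str, ref: str = "master") -> str:
    """URL of the same file on raw.githubusercontent.com."""
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{path}"


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def api_payload_digest(payload: dict) -> str:
    """Digest of the file bytes embedded (base64) in a contents-API JSON response."""
    return digest(base64.b64decode(payload["content"]))


def fetch(url: str) -> bytes:
    """Download `url` (network request; unauthenticated API calls are rate limited)."""
    with urlopen(url) as response:
        return response.read()


# Example (network access required):
# args = ("bokeh", "bokeh-notebooks", "quickstart/quickstart.ipynb")
# payload = json.loads(fetch(api_url(*args)))
# print(api_payload_digest(payload) == digest(fetch(raw_url(*args))))
```

If the two digests match, the API is serving the current file and the stale copy must come from somewhere else.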

bryevdv commented 4 years ago

The changes show up in GH raw view, and it's been four days (in the past updating nbviewer was never an issue, and never took more than an hour, much less a day). I think it's a stretch.

krinsman commented 4 years ago

Are we still talking about the notebook mentioned in this https://github.com/jupyter/nbviewer/issues/914#issuecomment-605736733 ?

bryevdv commented 4 years ago

Yes, that is the one I am referring to.


This is the new logo at the bottom that is on GH (including in the raw source, not just the GH rendered page):

Screen Shot 2020-04-02 at 10 59 30 PM

This is the old content still showing on nbviewer.org that will not update:

Screen Shot 2020-04-02 at 10 59 17 PM

krinsman commented 4 years ago

OK yes, you're right, this is bizarre.

Another reason why it does not seem to be a GitHub API issue is that, when one clicks the "Download notebook" link on the NBViewer page, the raw .ipynb JSON one downloads is identical to the one hosted on GitHub. So the GitHub API is downloading the right JSON; it would just seem that NBViewer is displaying a different JSON for some reason. EDIT: I just realized the above paragraph is stupid, since NBViewer doesn't use the API to download the JSON for the JSON button. It generates the raw.githubusercontent URL and then redirects there, so of course they match exactly. /EDIT

The only possible reason (at present) I can imagine why this might be happening is because this is one of the notebooks on the frontpage, and perhaps something with the deployment (which I am not familiar with) somehow makes the caching on the frontpage different. (I.e. this issue might not occur for notebooks which are not on the NBViewer frontpage.)

In any case, the frontpage needs to be updated anyway (e.g. the Plotly notebook no longer exists, the Lightning project is defunct, there are no examples of notebooks written in R, and there is only one example using Julia, despite the project being JuPyteR). So, optimistically, if the frontpage somehow has a separate cache from the rest of the application in the deployment, then whenever the frontpage is updated this issue will hopefully go away for the Bokeh notebook.

Ultimately though I have to admit that I am baffled by this and do not know what the issue is.

bryevdv commented 4 years ago

Thanks for looking in to it!

roberto-ceraolo commented 4 years ago

I have the exact same issue. I uploaded a new version to GitHub days ago, but nbviewer still has not updated. I also tried the ?flush_cache=true thing, but still nothing happens. Does anyone know why this is happening?

This is the link of my notebook for reference: https://nbviewer.jupyter.org/github/Robbberto/covid/blob/master/Covid.ipynb

roberto-ceraolo commented 4 years ago

As a test, I tried to delete the file from the github repo, and nbviewer is still showing the notebook!

artur-deluca commented 4 years ago

As a test, I tried to delete the file from the github repo, and nbviewer is still showing the notebook!

Same here, I've deleted the rendered file and uploaded an updated version with a different name. nbviewer not only renders the deleted file but it doesn't find the new one.

Update: after waiting a couple of hours, it worked.

aakhmetz commented 4 years ago

I have exactly the same problem as described by @artur-deluca: the old version was persistent (flushing the cache did not work). When I renamed my notebook, I got the message "Remote HTTP 404: scripts/XXX.ipynb not found among 16 files", similar to #916.

visualisedatadevelopment commented 4 years ago

I am also having the same issue: nbviewer is still displaying old versions even when I append ?flush_cache=true to the url.

marcelfg commented 4 years ago

I am also having the same issue. For now I am manually updating my links after each update. I'm using the link to nbviewer that GitHub generates on the .ipynb file page. It is a bad workaround, but still a workaround.

Screenshot from 2020-05-15 19-00-20

minrk commented 4 years ago

Sorry for being offline for a long time. I've looked into this and changed some caching configuration. We have too many caching layers that are semi-opaque (fastly and cloudflare), but I believe this should be fixed. We may see an increase in rate-limit consumption on nbviewer, though.
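A toy model of why several cache layers make this kind of bug hard to pin down: if a flush request is answered by an outer layer, it never reaches the inner one, and the outer layer keeps serving its stale copy until its own TTL expires. This is an illustration under simplified assumptions (two layers, a fixed TTL, made-up class names), not a description of the actual Fastly/Cloudflare configuration.

```python
class Origin:
    """Stands in for the application server with its own render cache."""

    def __init__(self):
        self.content = "v1"
        self.rendered = {}

    def render(self, url, flush=False):
        # A flush forces a re-render; otherwise the cached render is reused.
        if flush or url not in self.rendered:
            self.rendered[url] = self.content
        return self.rendered[url]


class EdgeCache:
    """Stands in for a CDN layer in front of the origin."""

    def __init__(self, origin, ttl):
        self.origin = origin
        self.ttl = ttl
        self.store = {}

    def get(self, url, now, flush=False):
        entry = self.store.get(url)
        if entry is not None and now - entry[0] < self.ttl:
            # Served from the edge: the origin never sees the flush request.
            return entry[1]
        body = self.origin.render(url, flush=flush)
        self.store[url] = (now, body)
        return body


origin = Origin()
edge = EdgeCache(origin, ttl=3600)
first = edge.get("/nb", now=0)
origin.content = "v2"  # the notebook changes upstream
stale = edge.get("/nb", now=10, flush=True)  # flush is swallowed by the edge
fresh = edge.get("/nb", now=4000, flush=True)  # edge entry expired, flush reaches origin
print(first, stale, fresh)
```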

sreekanthac commented 4 years ago

Hello, this issue still persists. Is any workaround available?

NielsKlaver commented 4 years ago

I had the same problem but was able to fix it by clearing the browser cache/history.

onastov commented 4 years ago

I am having the same issue with notebooks hosted on my personal website.

krinsman commented 4 years ago

@onastov You're right, I think there is an error in the code where the behavior intended with flush_cache=false is implemented with flush_cache=true. That's the only way I got my NBViewer extension to work: https://github.com/NERSC/clonenotebooks/search?q=flush_cache&unscoped_q=flush_cache

In order to fix this issue we would have to make a backwards incompatible fix to the bug, which seems to be what people want, since having the opposite behavior is counter-intuitive.

I have not had the cycles to devote to maintaining any software projects recently, being focused on my PhD research. I think this could be a pretty quick fix if/when I find the time -- I think I even posted the link to the specific lines of code which are the problem somewhere in this thread or another thread earlier, so if someone else made a PR fixing it which I could just merge, that would be a lot easier for me at least.

TMKlautau commented 4 years ago

Having the same problem with this notebook

https://nbviewer.jupyter.org/github/TMKlautau/animal_game_development/blob/master/Animal_game_development.ipynb

Has anyone found another fix? Even setting ?flush_cache=false is not working for me.

jiffyclub commented 4 years ago

Looks like some form of this is still an issue. I've been waiting for this notebook view to get refreshed for a couple of hours now. No combination of flush_cache flags or browser cache clearing seems to be helping. I get the same result in new browsers and incognito windows.

dealmeidavf commented 4 years ago

Same problem here. Notebook rendered by NBviewer is several days old and doesn’t update with or without the flush_cache flag.

allefeld commented 4 years ago

me too

quickgrid commented 4 years ago

Facing the same problem. It is showing an old version of the notebooks that I first opened in https://nbviewer.jupyter.org/. Also, one of my notebooks had a comma in its name and it showed a 400 error. But after renaming the notebook to keep only alphanumeric characters and underscores, it still shows a 400 error.

400 : Bad Request
{'type': 'string'} is not valid under any of the given schemas Failed validating 'oneOf' in execute_result['properties']['data']['patternProperties']['^(?!application/json$)[a-zA-Z0-9]+/[a-zA-Z0-9\\-\\+\\.]+$']: On instance['cells'][39]['outputs'][1]['data']['application/vnd.google.colaboratory.intrinsic+json']: {'type': 'string'}
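The 400 error above is about the notebook's contents rather than its file name, which would explain why renaming did not help. Reading the message: the output at `cells[39]`, `outputs[1]` stores a dict (`{'type': 'string'}`) under the MIME type `application/vnd.google.colaboratory.intrinsic+json`, and that key matches the schema pattern for non-JSON MIME types, where the validator apparently expects text rather than a JSON object. A small check of that reading using only `re`; the pattern is copied from the message with its doubled backslashes unescaped, and the "text" test is an assumption about what the schema accepts:

```python
import re

# Pattern from the error message (backslashes unescaped from the printed repr).
PATTERN = r"^(?!application/json$)[a-zA-Z0-9]+/[a-zA-Z0-9\-\+\.]+$"
MIME = "application/vnd.google.colaboratory.intrinsic+json"
value = {"type": "string"}  # the offending output data from the message

# The MIME type falls under the "anything except application/json" pattern...
matches_non_json_pattern = re.match(PATTERN, MIME) is not None

# ...but its value is a dict, not a string or list of strings (assumed requirement).
is_text = isinstance(value, str) or (
    isinstance(value, list) and all(isinstance(item, str) for item in value)
)

print(matches_non_json_pattern, is_text)  # prints: True False
```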
cbrnr commented 4 years ago

Unfortunately this problem still exists. My notebooks are not updated (e.g. this notebook), sometimes it takes a week until the new versions are rendered. It also seems like this code isn't wrong as some people have indicated:

https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/base.py#L537

Here, False is just the default value if the argument flush_cache does not exist, but the condition is only fulfilled if flush_cache is True. So the problem must be somewhere else.
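To make that reading concrete, here is a small stand-alone sketch of the pattern described, with a plain dict standing in for the request's query arguments (the real handler is not reproduced here). One consequence worth noting, assuming the raw string value is used in the condition without being parsed: any non-empty string, including `"false"`, is truthy in Python, so `?flush_cache=false` would take the same branch as `?flush_cache=true`.

```python
def get_argument(query: dict, name: str, default):
    """Stand-in for a handler's get_argument: the value if present, else the default."""
    return query.get(name, default)


def wants_flush(query: dict) -> bool:
    # Pattern as described: default False, condition on the raw value.
    return bool(get_argument(query, "flush_cache", False))


def wants_flush_strict(query: dict) -> bool:
    # Hypothetical stricter variant that parses the string instead.
    value = str(get_argument(query, "flush_cache", "")).strip().lower()
    return value in {"1", "true", "yes"}


print(wants_flush({}))                               # argument absent
print(wants_flush({"flush_cache": "true"}))
print(wants_flush({"flush_cache": "false"}))         # non-empty string is truthy
print(wants_flush_strict({"flush_cache": "false"}))
```

Either way, both variants agree for `?flush_cache=true`, so this would not by itself explain the stale pages.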

@minrk @krinsman do you have any additional ideas where to look? I'm afraid nbviewer is unusable in its current state, and it would be a pity if people had to look for alternatives because this is such a nice project!

maskrove commented 4 years ago

Unfortunately this problem still exists. My notebooks are not updated (e.g. this notebook), sometimes it takes a week until the new versions are rendered. It also seems like this code isn't wrong as some people have indicated:

https://github.com/jupyter/nbviewer/blob/master/nbviewer/providers/base.py#L537

Here, False is just the default value if the argument flush_cache does not exist, but the condition is only fulfilled if flush_cache is True. So the problem must be somewhere else.

@minrk @krinsman do you have any additional ideas where to look? I'm afraid nbviewer is unusable in its current state, and it would be a pity if people had to look for alternatives because this is such a nice project!

I have exactly the same problem, with the simplest of notebooks. I have to agree with the sentiment that nbviewer is not usable in this state.

Edit: While this problem persisted for a full day, it has now resolved itself.

cbrnr commented 4 years ago

@maskrove eventually the notebooks will show the latest version, but currently this might happen within an hour or within several days. The problem is that flush_cache=True, which could be used to immediately force the latest notebook version, stopped working at some point. So it seems you got lucky 😄!

djinnome commented 3 years ago

It is even weirder for me. Nbviewer was not updating the latest version, but now nbviewer is reverting to an even older version than it was before! What is going on? It would be really nice if that flush_cache option actually worked!

jbcaillau commented 3 years ago

Hi,

I noted the same issue months ago. It seems that the cache problem remains. A pity, given that nbviewer otherwise does a great job. Is anyone in charge around?

limegimlet commented 3 years ago

I too am having this problem. While I see my notebook rendered in github here:

https://github.com/limegimlet/covid19/blob/master/dec15_targets.ipynb

I cannot navigate to this notebook on nbviewer, nor does pasting the link above work.

Oddly enough, I was able to view a similar notebook yesterday, but then was unable to view the subsequent versions today. So I created this new notebook in the hopes that at least the initial commit would be visible. Alas no luck!

multinetlab commented 3 years ago

I'm having the same issue here! I committed a new version to my repo, but nbviewer won't change!

KlukvaMors commented 3 years ago

I'm having the same problem :(

meeseeksmachine commented 3 years ago

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/updating-binder-from-changes-on-github/7731/1

bryevdv commented 3 years ago

@meeseeksmachine I don't think that is the same issue. That Discourse post regards live running notebooks on binder. I've never experienced any issue with binder notebooks running or updating. Regardless, this issue is about purely static (not live) notebooks on nbviewer, which is completely separate from binder. They don't have anything in common.

babycamel commented 3 years ago

I have the same problem with nbviewer not regularly refreshing the cache to the latest version on GitHub. In addition, the download button on nbviewer no longer works the way it used to. nbviewer seems to have some maintenance issues. The problem is the DOM: the underlying JSON on nbviewer is correct, but it renders incorrectly.

Fkaule commented 3 years ago

Having the same issue: nbviewer doesn't show the newest version. Neither ?flush_cache=false nor ?flush_cache=true works.

subwaymatch commented 3 years ago

Same issue here.

MikkoHaavisto commented 3 years ago

I'm seeing different old versions with flush_cache=false and flush_cache=true and neither is the version currently displayed in GitHub.

@marcelfg's solution above works in most cases, but some notebook files are so big they don't render in GitHub even if they could be displayed in nbviewer. In this case, you can't select the link to nbviewer from GitHub.

empet commented 3 years ago

After a year this issue persists :(

leonheld commented 3 years ago

Still having the same issue. @marcelfg's workaround worked fine for me (using the permalink), although this doesn't seem to be expected behaviour.
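As a sketch of the permalink workaround: instead of a branch name such as `master`, the nbviewer URL names a specific commit, so each new commit gives a new URL that, presumably, no cache layer has stored yet. The URL layout mirrors the nbviewer links posted in this thread; the commit SHA below is a placeholder and `nbviewer_permalink` is a hypothetical helper.

```python
from urllib.parse import quote


def nbviewer_permalink(owner: str, repo: str, commit_sha: str, path: str) -> str:
    """Build an nbviewer URL pinned to one commit instead of a branch."""
    return (
        "https://nbviewer.jupyter.org/github/"
        f"{owner}/{repo}/blob/{commit_sha}/{quote(path)}"
    )


# "0123abc" is a placeholder; use the SHA of the commit you want to show.
print(nbviewer_permalink("bokeh", "bokeh-notebooks", "0123abc", "quickstart/quickstart.ipynb"))
```

The trade-off is the one already noted in the thread: every link has to be updated by hand after each commit.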

DocEpsilon commented 3 years ago

me too

Gkchandora commented 3 years ago

I am also facing the same issues! 1) I added files to an existing gist and nbviewer is not showing the new files. 2) I deleted the gist, yet nbviewer is still rendering the gist's files.

bjornarfjelldal commented 3 years ago

Can confirm this is still an issue as well. Would be really glad to see it resolved :)