rstudio / blogdown

Create Blogs and Websites with R Markdown
https://pkgs.rstudio.com/blogdown/

Website in Viewer is updated before Rmd rendering #486

Closed cderv closed 4 years ago

cderv commented 4 years ago

I believe that with blogdown.knit.on_save activated, the viewer is reloaded before the Rmd is re-rendered with the change, at least most of the time and on Windows.

This causes the viewer to show the previous change rather than the latest one.

Steps when this happens

See #481 for full workflow

Maybe related to --navigateToChanged not working ?

Note that it works OK if the post is opened in the browser.

There is something not stable in this I believe. I'll look into it based on these notes.

cderv commented 4 years ago

Out of curiosity I tried on Ubuntu with RStudio Server, and the website does not update...

Echoing the stdout of the hugo process helps to see that the website is correctly updated.

Maybe an issue with newer version of hugo ?

cderv commented 4 years ago

Maybe an issue with newer version of hugo ?

So live reload works well with the latest Hugo version when I test without R.

That means LiveReload works well, and so does --navigateToChanged.

It just does not work correctly with Rmd posts, and I believe it is because of the issue you mentioned: all the files are watched by the Hugo server (including intermediate files), and this surely makes it hard to reload the correct file.

IMO this impacts the experience of working in RStudio.

yihui commented 4 years ago

I think this is mostly because build_rmds() moves files around in the project, which confuses Hugo and it fails to infer which page it should navigate to.

Using page bundles will make it much better. I was working on it yesterday, but didn't manage to finish it.

Another thing is that we have to ignore .knit.md and .utf8.md (e.g., https://github.com/yihui/hugo-xmin/commit/f8510d84b0ebb6c3210e1da2237a313ef81b53c8), otherwise Hugo may also try to render them. We'll need to document this.
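To make the effect of such ignore rules concrete, here is a minimal sketch in JavaScript. The two patterns are assumptions modeled on the kind of regular expressions one would list in Hugo's ignoreFiles setting; they are not copied from the commit linked above, and the file names are made up.

```javascript
// Hypothetical ignore patterns for the intermediate files that rmarkdown
// writes next to foo.Rmd (assumed, not taken from the linked commit).
const ignorePatterns = [/\.knit\.md$/, /\.utf8\.md$/];

// Return true if a file name matches any of the ignore patterns.
function isIgnored(fileName) {
  return ignorePatterns.some((re) => re.test(fileName));
}

// Made-up listing of a content directory during a render.
const files = ["foo.Rmd", "foo.knit.md", "foo.utf8.md", "foo.md", "foo.html"];
const kept = files.filter((f) => !isIgnored(f));

console.log(kept);
```

With these patterns only the intermediate files are filtered out; foo.Rmd, foo.md and foo.html are still seen.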

The bug https://github.com/gohugoio/hugo/issues/3811 that I mentioned may be irrelevant. It will be great if it could be fixed, but I don't think it's the culprit for the failure of --navigateToChanged.

cderv commented 4 years ago

I think this is mostly because build_rmds() moves files around in the project, which confuses Hugo and it fails to infer which page it should navigate to.

This is indeed what I observe if I run hugo server in a terminal and read the output printed when I modify the files: lots of files are watched, removed, and written.

Another thing is that we have to ignore .knit.md and .utf8.md (e.g., yihui/hugo-xmin@f8510d8), otherwise Hugo may also try to render them. We'll need to document this.

Those are intermediary files that are temporary during the Rmd building right ? Is it a bad idea to set intermediate_dir in render() to a temp directory ? Or one way to "trick" hugo would be to render everything in another folder and move back to where hugo is watching only when rendered. You may already have thought about that so again wrong idea ? 😅

It will be great if it could be fixed, but I don't think it's the culprit for the failure of --navigateToChanged.

I was under the impression that, since all the files are watched, including the ones that should be ignored, --navigateToChanged won't know which one to navigate to; if those files were correctly ignored, that would leave us with the HTML file only. 🤷‍♂️

yihui commented 4 years ago

Those are intermediary files that are temporary during the Rmd building right ? Is it a bad idea to set intermediate_dir in render() to a temp directory ?

I'm not very comfortable with using intermediate_dir. You probably know that we have had a lot of bug reports related to this argument in the rmarkdown repo. On the other hand, ignoring .knit.md and .utf8.md is an easy task for users. On a related note, we really should retire the intermediate .knit.md and .utf8.md files in the rmarkdown package someday. They are no longer necessary now because we only support UTF-8, so for foo.Rmd, foo.knit.md and foo.utf8.md are identical to foo.md.

Or one way to "trick" hugo would be to render everything in another folder and move back to where hugo is watching only when rendered. You may already have thought about that so again wrong idea ?

The challenge is that you can't easily move an Rmd file away and render it elsewhere when it references external files (e.g., read.csv('../data/foo.csv')). Moving the Rmd file is probably less robust than using intermediate_dir.


As I said, with page bundles, this can be much easier. The _files and _cache directories don't need to be moved. The main decision that I'm still hesitating on is what to do with HTML dependencies. @apreshill suggested in #476 that I just leave them alone in the _files folder instead of moving them to static/ (of course, this requires the assumption of using page bundles), which is extremely easy for me (things can't be easier if you are told not to do anything), but I've been hesitating because there is a downside of this way, i.e. we may have a lot of duplicate CSS/JS libraries in a site. Some HTML dependencies are large and complex, such as those from DT and leaflet, and I don't feel very comfortable duplicating them for each post. This was the motivation behind moving HTML dependencies to static/rmarkdown-libs/.

Perhaps I should just provide an option and let users decide if they want to move HTML dependencies to static/. If they want to move, this will be another source of change for Hugo's livereload, which may also confuse Hugo's navigation (--navigateToChanged).

I'll keep experimenting. I probably can't make the auto navigation work perfectly for the coming CRAN release. One thing is for sure: the experience of writing pure Markdown posts will be much much better. Knitting individual posts used to be a totally wrong action for users to do, and now at least pressing the Knit button won't hurt any more. I guess that's already a huge step forward, despite the annoyance of excessive auto-refreshing and lack of support for auto-navigation.

yihui commented 4 years ago

I ended up using a dirty trick 04ef6ee to let Hugo know for sure which page it should navigate to. The trick was to wait for 2 seconds after building the Rmd file, and rewrite it one more time. It seems to work for both page bundles and non-bundles.
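For readers who want to see the shape of this trick, here is a rough sketch in JavaScript (Node.js). It is not the code from 04ef6ee; the function names and the default delay parameter are hypothetical, and it only illustrates "wait, then write the same content again" so that a file watcher sees one last write to the file of interest.

```javascript
const fs = require("fs");

// Rewrite a file with its own content so that file watchers see a fresh
// write event. Returns the number of bytes written.
function rewriteInPlace(file) {
  const content = fs.readFileSync(file);
  fs.writeFileSync(file, content);
  return content.length;
}

// Hypothetical helper: wait delayMs (2 seconds by default, as in the
// comment above), then rewrite the file once more.
function rewriteAfterDelay(file, delayMs = 2000) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      try {
        resolve(rewriteInPlace(file));
      } catch (err) {
        reject(err);
      }
    }, delayMs);
  });
}
```

A caller would run something like rewriteAfterDelay("content/post/foo.html") after the build has finished; the path here is only an example.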

There is still unnecessary refreshing. I'll see if I can fix it tomorrow.

cderv commented 4 years ago

Thanks for the detailed explanation above. It makes perfect sense.

Re. the dirty trick, this is clever !

Just tested it:

It seems to work well on Windows once you have first navigated to the page. When opening the project, it will open the home page; then modifying and saving will trigger the render, but it won't navigate to the post page as it does for .md files. However, once you are on the post page, it live reloads OK now! That is a much better experience!

I also tested on RStudio Server, and this does not work there. I think it is specific to the server architecture. Opening the devtools console, I got a loading error:

livereload.js:1 Failed to load resource: the server responded with a status of 404 (Not Found)

I believe the JS file from the local Hugo installation is not found or not exposed correctly on RStudio Server. This is a limitation we should document if we don't find a workaround. I'll try something though...

cderv commented 4 years ago

I'll try something though...

OK, it is not working as I thought. I wonder if some parameters should be passed to hugo server when used on RStudio Server.

cderv commented 4 years ago

And I think this is because RStudio Server acts as a proxy. There is even a function in rstudioapi to help with that. For example, if a Hugo server is served on port 6380:

> rstudioapi::translateLocalUrl("https://localhost:6380/livereload.js")
[1] "p/56a946ed/livereload.js"

Currently, as the error above shows, it tries to load http://localhost:8787/livereload.js, but it should be http://localhost:8787/p/56a946ed/livereload.js

We may need to configure hugo server to know this ... 🤔

yihui commented 4 years ago

I have this trick for RStudio Server: https://github.com/rstudio/blogdown/blob/ba977a94279e39f175b65dac6ec5288bb963c1df/R/utils.R#L656-L663 but it can't fix the problem of the path of livereload.js, because it's not controlled by the configuration relativeurls. I have also tried to use rstudioapi::translateLocalUrl() to change the baseURL temporarily before, but I don't remember why it didn't work now. I'll experiment one more time.

cderv commented 4 years ago

I have also tried to use rstudioapi::translateLocalUrl() to change the baseURL temporarily before

Yep, I am trying that, but with no success for now.

yihui commented 4 years ago

This is probably one of the reasons why I didn't use hugo server as the default server. The server started with servr doesn't suffer from this issue, because I have control over the JavaScript.

cderv commented 4 years ago

Yes the live reloading would work ok in that case.

Having RStudio use a location.pathname as a proxy to another port on localhost won't play well with Hugo's livereload, it seems. Even if we manage to make the file livereload.js found, I am not sure the reloading logic will work in this case, but that is a guess.

It is too bad the relative URL trick does not work for this JS resource but does for the other CSS and JS of the website, because the preview is working well.

I don't know if many people are using blogdown on rstudio server (or RStudio cloud)

yihui commented 4 years ago

We could file a bug report to Hugo, but I don't feel hopeful that it could be fixed (especially if we don't actually submit a PR to fix it), because this is indeed a rare use case.

I don't know if many people are using blogdown on rstudio server (or RStudio cloud)

#124 has some examples. That was the motivation behind the trick of setting relativeurls = true on RStudio Server.

cderv commented 4 years ago

Yes, it is not really cool that this does not work well on RStudio Server. But I guess Hugo itself is not working well.

BTW, changing the base URL does not seem to affect where livereload.js is fetched from. The request URL in the browser's Network pane still shows no change to the base URL: no pathname, even if I added one.

I think this is related to https://github.com/gohugoio/hugo/issues/1187

yihui commented 4 years ago

Actually I've been thinking for long about the possibility of connecting to the websocket from R. If we have access to the data that Hugo and the web browser send to each other, we could probably do something (like refreshing a page or auto-navigation) on our end. But I guess in this case, the websocket is not even successfully established in the first place... Another crazy idea is that we could open a browser in a background thread in R to process the request http://localhost:8787/p/56a946ed/livereload.js.

cderv commented 4 years ago

I believe the issue may come from here: https://github.com/gohugoio/hugo/blob/49972d07925604fea45afe1ace7b5dcc6efc30bf/transform/livereloadinject/livereloadinject.go#L62 where they inject into the HTML files a script tag with "/livereload.js" as its src attribute, and this does not include the /p/56a946ed path. However, it is a relative URL, right? Should it work as it is or not?

cderv commented 4 years ago

The idea I want to experiment with is this: if we don't serve the website from memory but write it to a folder instead, we would have access to the HTML files, and we could maybe change the script tag to load the JS script correctly.

I am experimenting with this now.

yihui commented 4 years ago

Where they inject into the HTML files a script tag with "/livereload.js" as its src attribute, and this does not include the /p/56a946ed path. However, it is a relative URL, right? Should it work as it is or not?

That's an absolute URL relative to the domain root. It's not a completely relative URL. See the second bullet in the last list: https://bookdown.org/yihui/blogdown/html.html This /livereload.js needs to be made truly relative. It shouldn't start with /, but either no / in the beginning, or a path like ../../ in the beginning. In other words, the relative URL should be relative to the current page, not the website root.
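The difference can be checked with the standard URL class. This is only a sketch: the page URL below (a post at post/hello/ behind the /p/56a946ed/ prefix) is a made-up example, and resolveFrom is a hypothetical helper name.

```javascript
// Hypothetical page URL behind the RStudio Server proxy prefix /p/56a946ed/.
const page = "http://localhost:8787/p/56a946ed/post/hello/";

// Resolve a script src the way a browser would, relative to the page URL.
function resolveFrom(pageUrl, src) {
  return new URL(src, pageUrl).href;
}

// A leading slash resolves against the domain root and drops the prefix.
const rootAbsolute = resolveFrom(page, "/livereload.js");

// A path relative to the current page keeps the prefix.
const pageRelative = resolveFrom(page, "../../livereload.js");

console.log(rootAbsolute); // http://localhost:8787/livereload.js
console.log(pageRelative); // http://localhost:8787/p/56a946ed/livereload.js
```

The first result is the URL that returns 404 on RStudio Server; the second is the one the proxy can actually serve.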

Yet another idea is to modify

<script src="/livereload.js?port=4321&amp;mindelay=10&amp;v=2" data-no-instant defer></script>

to

<script src="/p/56a946ed/livereload.js?port=4321&amp;mindelay=10&amp;v=2" data-no-instant defer></script>

on the page using JavaScript (document.getElementsByTagName('script')[0].src = ...). But I'm afraid we can't get this to work automatically, and users will have to add the JS code in their template.
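The string rewrite itself could look roughly like this. It is a sketch only: withProxyPrefix is a hypothetical helper, the prefix has to be known in advance, and nothing here verifies that a browser would re-fetch the script after its src is changed.

```javascript
// Hypothetical helper: prepend a proxy prefix to a root-absolute script path.
// Relative paths, protocol-relative URLs and already-prefixed paths are
// returned unchanged.
function withProxyPrefix(src, prefix) {
  const cleanPrefix = prefix.replace(/\/+$/, ""); // drop trailing slashes
  if (!src.startsWith("/") || src.startsWith("//")) return src;
  if (src.startsWith(cleanPrefix + "/")) return src;
  return cleanPrefix + src;
}

// The src from the tag above, as the DOM exposes it (with & instead of &amp;).
const original = "/livereload.js?port=4321&mindelay=10&v=2";
const patched = withProxyPrefix(original, "/p/56a946ed/");

console.log(patched); // /p/56a946ed/livereload.js?port=4321&mindelay=10&v=2
```

In a template, the result would be assigned to the script element's src, which is the part users would have to add themselves.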

The idea I want to experiment with is this: if we don't serve the website from memory but write it to a folder instead, we would have access to the HTML files, and we could maybe change the script tag to load the JS script correctly.

I guess this won't work, because Hugo dynamically injects the <script> when serving the HTML files. It is probably not included in the static HTML files.

cderv commented 4 years ago

I guess this won't work, because Hugo dynamically injects the <script> when serving the HTML files. It is probably not included in the static HTML files.
