rstudio / blogdown

Create Blogs and Websites with R Markdown
https://pkgs.rstudio.com/blogdown/

Duplicate html files from same rmarkdown #568

Closed. llrs closed this issue 3 years ago

llrs commented 3 years ago

I have a problem serving my blog https://github.com/llrs/blogR. Everything seems ok according to check_site() and I build the website:

Building sites … 
                   |  EN   
-------------------+-------
  Pages            |  133  
  Paginator pages  |    8  
  Non-page files   |  209  
  Static files     | 1614  
  Processed images |   11  
  Aliases          |   37  
  Sitemaps         |    1  
  Cleaned          |    0  

Total in 555045 ms

This seems longer than I would expect but lately my computer is slow, so not sure if it is part of the issue or not.

And then try to serve it:

blogdown:::serve_site()
Launching the server via the command:
  hugo server --bind 127.0.0.1 -p 4321 --themesDir themes -t hugo-academic -D -F --navigateToChanged
Error: Failed to launch the preview of the site. This may be a bug of blogdown. You may file a report to https://github.com/rstudio/blogdown/issues with a reproducible example. Thanks!

As I cannot preview it, I pushed to GitHub so that Netlify could build it, but it hit the following error:

template for shortcode "blogdown/postref" not found

This happens in some *.html files. For instance, I have a post at content/post/2019-05-24-fires-in-mexico/index.html with the following line:

<script src="{{< blogdown/postref >}}index_files/header-attrs/header-attrs.js"></script>

I also have a content/post/2019-05-24-fires-in-mexico.html without that line, which imho should be correct as I don't remember using a header on this blogdown site.

Not sure whether check_site() should detect this (duplicate builds?), or what is preventing the site from being served. In addition, I cannot use blogdown::stop_server():

Warning message:
In blogdown::stop_server() :
  Failed to kill the process(es): 166081. You may need to kill them manually.

xfun::session_info('blogdown')
R version 4.0.1 (2020-06-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.1 LTS, RStudio 1.4.904

Locale:
  LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=es_ES.UTF-8        LC_COLLATE=en_US.UTF-8    
  LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=es_ES.UTF-8       LC_NAME=C                 
  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C       

Package version:
  base64enc_0.1.3 BH_1.72.0.3     blogdown_1.0.1  bookdown_0.21   digest_0.6.27   evaluate_0.14  
  glue_1.4.2      graphics_4.0.1  grDevices_4.0.1 highr_0.8       htmltools_0.5.0 httpuv_1.5.4   
  jsonlite_1.7.2  knitr_1.30      later_1.1.0.1   magrittr_2.0.1  markdown_1.1    methods_4.0.1  
  mime_0.9        promises_1.1.1  R6_2.5.0        Rcpp_1.0.5      rlang_0.4.10    rmarkdown_2.6  
  servr_0.21      stats_4.0.1     stringi_1.5.3   stringr_1.4.0   tinytex_0.28    tools_4.0.1    
  utils_4.0.1     xfun_0.20       yaml_2.2.1     

Hugo version: 0.67.1

yihui commented 3 years ago

Do you have the file layouts/blogdown/postref.html (edit: sorry I meant layouts/shortcodes/blogdown/postref.html)? If you do, you need to commit it. If you don't, it may be a blogdown bug.

llrs commented 3 years ago

No, I don't have a layouts/blogdown/postref.html. I have a layouts/shortcodes/blogdown/postref.html, but it is not tracked. How should I create that file?

cderv commented 3 years ago

The correct path is the one you have: layouts/shortcodes/blogdown/postref.html. This file is required for the website to be built correctly. You need to commit it so that Netlify can access it. This should fix the shortcode issue on Netlify.

Can you try that, and let's see if there are still issues when building / serving the site?

llrs commented 3 years ago

I couldn't serve the site locally with blogdown::serve_site(), but Netlify is able to build and publish it.

cderv commented 3 years ago

Does check_site() come up clean? Can you build the site locally (blogdown::build_site())? What is the issue / error message you get with blogdown::serve_site()?

If you get one, can you retry using:

options(
  xfun.bg_process.verbose = TRUE,
  blogdown.use.processx = FALSE
)

in the .Rprofile?

This should show more verbose output from the Hugo background process.

Thank you

llrs commented 3 years ago

Yes, check_site() comes up clean.

I can build with blogdown::build_site():

Building sites … 
                   |  EN   
-------------------+-------
  Pages            |  133  
  Paginator pages  |    8  
  Non-page files   |  200  
  Static files     | 1614  
  Processed images |   11  
  Aliases          |   37  
  Sitemaps         |    1  
  Cleaned          |    0  

Total in 546393 ms

The error I see is:

Launching the server via the command:
  hugo server --bind 127.0.0.1 -p 4321 --themesDir themes -t hugo-academic -D -F --navigateToChanged
Error: Failed to launch the preview of the site. This may be a bug of blogdown. You may file a report to https://github.com/rstudio/blogdown/issues with a reproducible example. Thanks!

With those options I see:

Launching the server via the command:
  hugo server --bind 127.0.0.1 -p 4321 --themesDir themes -t hugo-academic -D -F --navigateToChanged
Building sites … Error: Failed to launch the preview of the site. This may be a bug of blogdown. You may file a report to https://github.com/rstudio/blogdown/issues with a reproducible example. Thanks!

yihui commented 3 years ago

I just realized that I mixed up two error messages. In your case, it seems your Hugo server needs quite a long time to start up, so you may set options(blogdown.server.timeout = 600) before serving the site. That said, I'm surprised that your site needs nearly 10 minutes to build locally.
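
A minimal sketch of how that option might be set, for example in a project-level .Rprofile. The option name and the value 600 come from the comment above; the unit is presumably seconds, given the roughly 10-minute build, and the variable name timeout_seconds is only illustrative.

```r
# Sketch for a project-level .Rprofile; not an official recommendation.
# blogdown.server.timeout is the option named in the comment above.
# Assumption: the value is in seconds (600 is roughly the 10-minute build time).
timeout_seconds <- 600  # illustrative variable name
options(blogdown.server.timeout = timeout_seconds)

# Confirm the option is visible in the current session before serving the site.
getOption("blogdown.server.timeout")
```

Setting it in .Rprofile means it applies each time the project's R session starts, rather than having to be typed before every call to serve the site.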

llrs commented 3 years ago

Yes, it is a surprise to me too. It previously built in less than a minute, and I expected that if I built first, I wouldn't need to wait so long to serve the site. It looks to me like it builds again before serving. Could this be because there are two html files for each post, as described in my first message?

I did set that option, but I couldn't find it on the help page of serve_site(). Could those options be added to the documentation of the functions they affect?

cderv commented 3 years ago

> Could this be possible because there are two html files for each post as described on my first message?

Do you expect to have two html files per post? Or is this an issue you ran into while using blogdown?

llrs commented 3 years ago

I just wrote one .Rmd file and expected a single html file per post. I think this is an issue using blogdown.

Sorry, I didn't report on the effect of blogdown.server.timeout because I haven't been able to let it finish (and won't until after I finish working).

cderv commented 3 years ago

I see now. This was lost in the different topics of your first post.

You have two files:

  1. content/post/2019-05-24-fires-in-mexico/index.html
  2. content/post/2019-05-24-fires-in-mexico.html

This is unintended, but I may know where it comes from.

I believe the first one is the new one (it contains the new shortcode). It was created because blogdown now uses Hugo's page bundles by default, which was not the case before. With bundles, blogdown creates a folder for each post, with an index.html rendered from index.Rmd and the resources at the root of this folder.

I believe the second one is your initial one, from when page bundles were not used for your site. In that case the post was postname.Rmd rendered to postname.html.

It seems that using the new version of blogdown led you to end up in this weird state... Usually, you can have a mix of bundled and unbundled posts with Hugo, and our change should only affect new posts you create. I am curious how an old post ended up being there twice.

What is the name of the Rmd file you are editing, and where is it in the folder tree? Were you using page bundles with Hugo before? How were your posts organized? Did you maybe run blogdown::bundle_site() in the past? See ?blogdown::bundle_site()

There is an option to not use page bundles at all if you wish, but as you seem to be in an in-between state, that is not useful now.

Also @yihui, maybe we could check for such duplicate posts ? We need to understand further what happened though.

llrs commented 3 years ago

Apologies, I tried to provide all the relevant information of the state of my website and build process and didn't identify the multiple issues I have on the website.

The Rmd file is in the bundle, at content/post/2019-05-24-fires-in-mexico/index.Rmd. Previously I wasn't using bundles. I think I switched in August using blogdown::bundle_site(), and blogdown::serve_site() kept serving the site locally fine and fast.

I would like to keep using bundles. So if I understand correctly, I can safely remove all the content/post/*.html files, but not those in any subfolder (content/post/*/*.html).

cderv commented 3 years ago

Yes, if you are using bundles, the content/post/*.html files should not have an associated .Rmd file, and should not have been updated since you switched in August. However, Hugo sees those files, and this may create some weird conflicts.

You should have only one html per Rmd file - and if you are using bundles, index.Rmd should lead to index.html only, with the post name as the folder name.

You can try removing the duplicated html files you know are no longer useful and see if it changes things for you.
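
To find the candidates before deleting anything, something like the following sketch could help. It assumes the layout discussed above (content/post/name.html next to content/post/name/index.html); the helper name find_duplicate_html is hypothetical and not part of blogdown, and the demo runs in a temporary directory.

```r
# Sketch: list top-level content/post/*.html files that also exist as a
# page bundle (content/post/<name>/index.html).
# find_duplicate_html() is a hypothetical helper, not a blogdown function.
find_duplicate_html <- function(post_dir = file.path("content", "post")) {
  # Top-level .html files only (list.files() is not recursive by default)
  flat <- list.files(post_dir, pattern = "\\.html$", full.names = TRUE)
  # The bundle output that would correspond to each flat file
  bundled <- file.path(tools::file_path_sans_ext(flat), "index.html")
  # Keep only the flat files that have a bundled counterpart
  flat[file.exists(bundled)]
}

# Demo in a temporary directory so that no real site is touched
demo_dir <- file.path(tempfile("site"), "content", "post")
dir.create(file.path(demo_dir, "2019-05-24-fires-in-mexico"), recursive = TRUE)
file.create(file.path(demo_dir, "2019-05-24-fires-in-mexico", "index.html"))
file.create(file.path(demo_dir, "2019-05-24-fires-in-mexico.html"))
file.create(file.path(demo_dir, "standalone-post.html"))

dups <- find_duplicate_html(demo_dir)
basename(dups)
```

The function only reports files; reviewing the list and removing them by hand (or with file.remove()) keeps the deletion step deliberate.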

yihui commented 3 years ago

This is a bug of bundle_site() and I'll fix it soon. Thanks for the report!

yihui commented 3 years ago

This issue has revealed four problems in total, and all have been fixed in the current dev version. With

remotes::install_github('rstudio/blogdown')

you can run blogdown::check_site(), and it will tell you to clean up the duplicate .html files.

llrs commented 3 years ago

Again, many thanks for the assistance @yihui and @cderv! To close the loop (not sure if it is a bug or where it should be fixed), I think I found what was slowing down the build process. There were about 1000 csv files in a static/ folder (which wasn't committed or pushed, so Netlify didn't see them). Deleting that folder made it possible to build the site in a reasonable time (6967 ms) and serve it.
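
For anyone who hits a similarly slow build, a quick way to spot this kind of clutter is to count files by extension under static/. This is only a sketch in base R: the helper name count_files_by_ext is hypothetical, and the demo uses a temporary directory with a handful of made-up files.

```r
# Sketch: count files by extension under a directory such as static/.
# count_files_by_ext() is a hypothetical helper, not a blogdown function.
count_files_by_ext <- function(dir = "static") {
  files <- list.files(dir, recursive = TRUE)
  ext <- tolower(tools::file_ext(files))
  ext[ext == ""] <- "(none)"  # label files without an extension
  sort(table(ext), decreasing = TRUE)
}

# Demo with made-up files in a temporary directory
demo_static <- tempfile("static")
dir.create(file.path(demo_static, "data"), recursive = TRUE)
file.create(file.path(demo_static, "data", sprintf("table-%03d.csv", 1:5)))
file.create(file.path(demo_static, "logo.png"))

counts <- count_files_by_ext(demo_static)
counts
```

An unexpectedly large count for one extension is a hint that generated or scratch files ended up in a folder Hugo copies on every build.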

yihui commented 3 years ago

Good to know! Thanks for explaining the mystery! (I tried your repo but wasn't able to reproduce the slowness, and now I know why)

malcolmbarrett commented 3 years ago

I had a related issue where I was getting the Netlify error template for shortcode "blogdown/postref" not found, and blogdown::check_site() was telling me to add that shortcode to git. I'm sharing my solution here in case anyone else has trouble and stumbles on this post.

What was happening in my case was that I had blogdown/ in my .gitignore (not sure if this is recommended, since it is not suggested in check_site()). layouts/shortcodes/blogdown/ was getting git-ignored as a consequence. To fix this, I had to target just the top-level folder by changing it to /blogdown/

cderv commented 3 years ago

Thanks @malcolmbarrett !

I think it could be a good idea to add this one to check_gitignore: if blogdown/ is found, advise changing it to /blogdown/*, as I think this folder was indeed something people put in the gitignore file of blogdown websites.
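
The kind of check described here could be sketched as follows. This is not blogdown's actual check_gitignore code; the helper name suggest_anchored_pattern is hypothetical, and it uses the /blogdown/ form from the fix reported above.

```r
# Sketch of the check discussed above; NOT blogdown's real implementation.
# suggest_anchored_pattern() is a hypothetical helper.
suggest_anchored_pattern <- function(lines, dir_name = "blogdown") {
  trimmed <- trimws(lines)
  # Unanchored forms ("blogdown" or "blogdown/") can match at any depth,
  # including layouts/shortcodes/blogdown/
  unanchored <- trimmed %in% c(dir_name, paste0(dir_name, "/"))
  if (any(unanchored)) {
    message(
      "Found '", trimmed[unanchored][1], "' in .gitignore; consider '/",
      dir_name, "/' to ignore only the top-level folder."
    )
  }
  # Return the lines with the anchored pattern substituted
  lines[unanchored] <- paste0("/", dir_name, "/")
  lines
}

# Usage with made-up .gitignore content
ignore_lines <- c(".Rproj.user", "public/", "blogdown/")
fixed <- suggest_anchored_pattern(ignore_lines)
fixed
```

In a real project the input would come from readLines(".gitignore"); the sketch takes a character vector so it can be tried without touching any file.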

yihui commented 3 years ago

@cderv Good idea indeed. Please feel free to make the change in the master branch. I feel this should be straightforward to implement. If you are not sure, sending a PR is also good! Thanks!

cderv commented 3 years ago

@yihui I have done it. I just wonder if we should also use /dir/ for public or resources.

In gitignore patterns, a leading / anchors the pattern to the directory that contains the .gitignore, so it is not matched recursively. This is why blogdown/ without the leading slash was too broad: every folder named blogdown, at any depth, was ignored.

What do you think?

llrs commented 3 years ago

I suggest doing it. I initially had it set up that way (public/, ...), but following blogdown::check_site() I removed the trailing slash, as it wouldn't stop complaining...

cderv commented 3 years ago

As @llrs confirmed it is a potentially annoying case, I also pushed a change for it.