hexojs / hexo

A fast, simple & powerful blog framework, powered by Node.js.
https://hexo.io
MIT License

Custom theme causing hexo server and hexo generate to take up to several hours #4309

Open dylanilvento opened 4 years ago

dylanilvento commented 4 years ago

Actual behavior

I'm using hexo with a custom theme that I've built myself. In the past, generate and server times were manageable, but recently running hexo server can take upwards of 1 to 2 hours of processing time before I can view the site on localhost.

I know there can be several causes for this: the number of posts on the site, the number of tags used, not implementing the fragment_cache helper (or not using the cache argument in partials), the choice of markdown renderer, and running a version of hexo older than 4.0. But addressing these concerns does not seem to alleviate the problem I'm having.

My site has about 300 posts, most of them for one of my two podcasts, which use the hexo-generator-multiple-podcast plugin to generate the corresponding RSS-valid XML file via tags. I have about 60 tags total, with only two of them being used for RSS. I'm unsure if the podcast plugin is also contributing to the processing time. You can see the repo for my dev branch, as well as all of my attempts to improve the situation here: https://github.com/dylanilvento/wardsite/tree/develop

After reading about how hexo uses only one core to run these processes, I recently stumbled upon the fragment_cache helper that helps with processing. Looking up how to implement it took a second, and I'm still not sure if I've coded it properly since some examples use

<%- partial('_partial/partial-example', null, {cache: true}) %>

while others use

<%- partial('_partial/partial-example', {}, {cache: true}) %>

and I'm unsure if these are both valid syntax.

Other references to caching in hexo also mention enabling it in a config file somewhere, but are unclear about which config file or what the syntax looks like. I found one example that says to add this to the root _config.yml file:

server:
  cache: true

But that did not seem to fix it either.

I'd appreciate any help on what I may be missing.

Environment & Settings

Node.js & npm version

node: v10.15.3 npm: v6.4.1

Your site _config.yml (Optional)

# Hexo Configuration
## Docs: https://hexo.io/docs/configuration.html
## Source: https://github.com/hexojs/hexo/

# Site
title: Ward
subtitle:
description:
keywords:
author:
language:
timezone: America/New_York

# URL
## If your site is put in a subdirectory, set url as 'http://yoursite.com/child' and root as '/child/'
url: http://ward-games.com
root: /
permalink: :layout/:title/
permalink_defaults:

# Directory
source_dir: source
public_dir: public
tag_dir: tags
archive_dir: archives
category_dir: categories
code_dir: downloads/code
i18n_dir: :lang
skip_render:

# Writing
new_post_name: :title.md # File name of new posts
default_layout: post
titlecase: false # Transform title into titlecase
external_link: true # Open external links in new tab
filename_case: 0
render_drafts: false
post_asset_folder: true
relative_link: false
future: true
highlight:
  enable: true
  line_number: true
  auto_detect: false
  tab_replace:

# Home page setting
# path: Root path for your blogs index page. (default = '')
# per_page: Posts displayed per page. (0 = disable pagination)
# order_by: Posts order. (Order by date descending by default)
index_generator:
  path: ''
  per_page: 10
  order_by: -date

# Category & Tag
default_category: uncategorized
category_map:
tag_map:

tag_generator:
  per_page: 20
  order_by: -date

# Date / Time format
## Hexo uses Moment.js to parse and display date
## You can customize the date format as defined in
## http://momentjs.com/docs/#/displaying/format/
date_format: MMMM Do, YYYY 
time_format: h:mm a

# Pagination
## Set per_page to 0 to disable pagination
per_page: 10
pagination_dir: page

# Extensions
## Plugins: https://hexo.io/plugins/
## Themes: https://hexo.io/themes/
theme: ward

podcasts:
- feed:
    path: wardcast.xml # relative to site root
    title: Wardcast
    subtitle: Missives from the front lines of game development.
    image: img/podcast/wardcast/wardcast-show-image.jpg # relative to config.url + config.root
    non_feed_url: tags/wardcast/ # the non-feed source for this content
    tag: wardcast # see below for tag-based feed
    limit: 500 # how many episodes will be in the feed, max
    content: true # whether to include text content along with the podcast; see note below
    content_encoded: true # same as above, but includes the complete encoded text of the post; see note below
    itunes:
      summary: 'Join Ward Games and friends on the journey of a lifetime: learning how to make games for a living! Listen in as we talk with game dev folks both local and from around the world about game development, gaming news, and more.'
      author: Ward Staff # defaults to config.author
      owner: Ward Games
      email: contact@ward-games.com # e-mail from which iTunes podcast is registered
      category: Leisure # the category from iTunes; make sure to use their values
      subcategory: Video Games # same as above
      explicit: yes # valid values are yes, no, and clean
    media_base_url: http://ward-games.com/wardcast/ # why repeat that in every post?
    default_media_type: audio/mpeg # can be overridden in post
- feed:
    path: attract-mode.xml # relative to site root
    title: 'Attract Mode: A Wardcast Series'
    subtitle: Wherein we watch every video game movie ever made.
    image: img/podcast/attract-mode/attract-mode-show-image.jpg # relative to config.url + config.root
    non_feed_url: tags/attract-mode/ # the non-feed source for this content
    tag: attract-mode # see below for tag-based feed
    limit: 100 # how many episodes will be in the feed, max
    content: true # whether to include text content along with the podcast; see note below
    content_encoded: true # same as above, but includes the complete encoded text of the post; see note below
    itunes:
      summary: 'Join Wardcast compatriots Dylan Ilvento, Nick Nundahl, and Joe Wetmore on their journey to watch every video game film to grace the silver screen — and some that didn’t even make it that far!'
      author: Ward Staff # defaults to config.author
      owner: Ward Games
      email: contact@ward-games.com # e-mail from which iTunes podcast is registered
      category: TV & Film # the category from iTunes; make sure to use their values
      subcategory: Film Reviews # same as above
      explicit: yes # valid values are yes, no, and clean
    media_base_url: http://ward-games.com/attract-mode/ # why repeat that in every post?
    default_media_type: audio/mpeg # can be overridden in post

# Deployment
## Docs: https://hexo.io/docs/deployment.html
deploy:
  type:

server:
  cache: true

Your theme _config.yml (Optional)

(This file is empty.)

Hexo and Plugin version (npm ls --depth 0)

hexo-site@0.0.0 /Users/dylan/Repos/wardsite
├── hexo@4.2.0
├── hexo-generator-archive@0.1.5
├── hexo-generator-category@0.1.3
├── hexo-generator-index@0.2.1
├── hexo-generator-multiple-podcast@0.9.5
├── hexo-generator-tag@0.2.0
├── hexo-renderer-ejs@0.3.1
├── hexo-renderer-markdown-it@4.1.0
├── hexo-renderer-stylus@0.3.3
└── hexo-server@0.3.3

Your package.json

{
  "name": "hexo-site",
  "version": "0.0.0",
  "private": true,
  "hexo": {
    "version": "4.2.0"
  },
  "dependencies": {
    "hexo": "^4.2.0",
    "hexo-generator-archive": "^0.1.5",
    "hexo-generator-category": "^0.1.3",
    "hexo-generator-index": "^0.2.1",
    "hexo-generator-multiple-podcast": "^0.9.5",
    "hexo-generator-tag": "^0.2.0",
    "hexo-renderer-ejs": "^0.3.1",
    "hexo-renderer-markdown-it": "^4.1.0",
    "hexo-renderer-stylus": "^0.3.3",
    "hexo-server": "^0.3.3"
  }
}

SukkaW commented 4 years ago

> I'm unsure if these are both valid syntax.

The syntax using null is the invalid one. However, Hexo can handle null locals, so it won't cause a problem (though it's still not recommended).
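
As a rough illustration of why the cache option matters whichever form is used, here is a minimal sketch of keyed fragment caching in general. It is not Hexo's actual implementation, and createFragmentCache and renderHeader are hypothetical names: a fragment that is identical on every page gets rendered once and is then served from memory.

```javascript
// Illustrative only: a tiny keyed cache in the spirit of a fragment cache.
// createFragmentCache is a hypothetical name, not part of Hexo.
function createFragmentCache() {
  const store = new Map();
  return function fragmentCache(id, render) {
    // Render only the first time this id is requested.
    if (!store.has(id)) {
      store.set(id, render());
    }
    return store.get(id);
  };
}

let renderCount = 0;
const fragmentCache = createFragmentCache();

// Stand-in for an expensive partial such as a site header.
const renderHeader = () => {
  renderCount += 1;
  return '<header>Ward</header>';
};

// Simulate including the same partial on 300 pages.
for (let page = 0; page < 300; page += 1) {
  fragmentCache('header', renderHeader);
}

console.log(renderCount); // 1
```

Caching like this only helps for fragments that do not change from page to page; a partial whose output depends on the current post should not be cached this way.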


Normally, generating 300 posts should take less than 20 seconds. You can switch the theme back to the default hexo-theme-landscape and see if there is any performance improvement. If not, then try uninstalling the multiple-podcast plugin.


Also, you could run hexo clean && hexo g --debug and paste the output at https://paste.ubuntu.com. In debug mode (which logs more details), it will be easier to find out where hexo is getting stuck.

dylanilvento commented 4 years ago

I ran a hexo clean && hexo g --debug for part of it. Here's the output.

It seems the generation process slows way down when it hits the podcast art files, which are unique images for each episode that are read into the RSS. They're 2200 x 2200 px in dimension but are only ~3MB at their largest. I'm thinking that maybe burning it into the RSS takes a significant amount of time — each one of these images alone is taking ~10 seconds to process.

I'll try removing these images and see if that improves the speed.
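
One way to check which files dominate the source folder is a small Node.js script along these lines. It is only a sketch: listFiles and largestFiles are hypothetical helper names, and the directory name source is assumed from source_dir in the config above.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively collect { file, bytes } for every regular file under dir.
function listFiles(dir) {
  const results = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      results.push(...listFiles(fullPath));
    } else if (entry.isFile()) {
      results.push({ file: fullPath, bytes: fs.statSync(fullPath).size });
    }
  }
  return results;
}

// Return the `count` largest files under dir, biggest first.
function largestFiles(dir, count) {
  return listFiles(dir)
    .sort((a, b) => b.bytes - a.bytes)
    .slice(0, count);
}

// Assumed folder name; adjust to match source_dir in _config.yml.
const sourceDir = 'source';
if (fs.existsSync(sourceDir) && fs.statSync(sourceDir).isDirectory()) {
  for (const { file, bytes } of largestFiles(sourceDir, 10)) {
    console.log(`${(bytes / 1024 / 1024).toFixed(1)} MB  ${file}`);
  }
}
```

Run from the site root, this prints the ten largest files, which makes it easier to compare them against the slow entries in the debug log.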

dylanilvento commented 4 years ago

Alright, after deleting and shrinking some of these larger image files, I tried another generate build, and, similar to the previous one, it really started to hang after a minute or two of processing. I let it run for a bit and the debug prompt finally gave me this:

[image: debug output screenshot]

This is the first podcast file it processed, and it took a solid seven minutes to do so. The kicker is that this audio file, clocking in at about 8 minutes and ~10 MB, is probably the smallest audio file on the site. Most of my podcasts are an hour or two long and 100 to 200 MB in size. So if the processing time scales roughly one-to-one with the length of the podcast, keeping the files in the project when I run a generate command isn't feasible.

This isn't especially deal-breaking, since I can probably figure out some way to automate re-adding the audio files whenever I need to do a new build. But I'm really curious why it would take so long to process these files. I was under the impression that the generate command, when it came to image and audio files, was simply moving them to a different directory. If that's not the case, I'm wondering what sort of processing these files need in order to be prepared for the site's generation.

Either way, I'll try removing those files to see what it does, but my guess is that it's going to significantly shrink the processing time for the build.

stevenjoezhang commented 4 years ago

See also https://github.com/hexojs/hexo/issues/2579

SukkaW commented 4 years ago

@stevenjoezhang @hexojs/core @dailyrandomphoto

I have read the code recently. It appears that Hexo processes every file under source_dir into an instance of its File class, and none of the features we currently have will solve the issue.

We should implement a feature that copies static files directly into public, without processing or rendering them.
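
Until something like that exists, a workaround along the lines dylanilvento describes might look like the sketch below. It assumes the large audio files are kept in a media folder outside source_dir and copied into public after hexo generate has finished. The folder name media and the helper copyTree are hypothetical, and this is not a Hexo feature.

```javascript
const fs = require('fs');
const path = require('path');

// Copy every file under srcDir into destDir, preserving the folder layout.
// Returns the number of files copied.
function copyTree(srcDir, destDir) {
  fs.mkdirSync(destDir, { recursive: true });
  let copied = 0;
  for (const entry of fs.readdirSync(srcDir, { withFileTypes: true })) {
    const from = path.join(srcDir, entry.name);
    const to = path.join(destDir, entry.name);
    if (entry.isDirectory()) {
      copied += copyTree(from, to);
    } else if (entry.isFile()) {
      fs.copyFileSync(from, to);
      copied += 1;
    }
  }
  return copied;
}

// Assumed layout: ./media holds the audio, ./public is Hexo's public_dir.
const mediaDir = 'media';
if (fs.existsSync(mediaDir) && fs.statSync(mediaDir).isDirectory()) {
  console.log(`copied ${copyTree(mediaDir, 'public')} files`);
}
```

Because hexo clean deletes the public folder, a script like this would need to run after every generate, for example chained after it in an npm script.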