Magickbase / nervos-official-website

nervos.org
https://nervos-official-website.vercel.app/

Reduce size of static files #297

Open Keith-CY opened 1 year ago

Keith-CY commented 1 year ago

This issue comes from the error at https://github.com/NervosEducationHub/EducationHubArticles/actions/runs/5763474768/job/15625346948#step:6:19

Error: The Serverless Function "en/home" is 50.26mb which exceeds the maximum size limit of 50mb.

It happened because the whole public folder (https://github.com/Magickbase/nervos-official-website/tree/develop/public) was uploaded to the serverless function. The folder contains a submodule named education_hub_articles pointing to https://github.com/NervosEducationHub/, which includes all the raw markdown files and images used to generate the article pages of the knowledge base.

All the markdown files and images can be removed after the article pages are generated. However, the knowledge-base homepage is server-rendered, so it fetches these raw files in the serverless function.

Before this hack (https://github.com/Magickbase/nervos-official-website/commit/5668c0d303eb59d8ba3249d5fd03da0d271052b6), all the markdown files were uploaded to the serverless function for the server-rendered page. This hack removed the images from the serverless function and reused the cover images from the statically generated article pages (https://github.com/Magickbase/nervos-official-website/commit/5668c0d303eb59d8ba3249d5fd03da0d271052b6#diff-a8b84b632042488f87d04ecb12cd92cca82964a9583b052b715850cffcc1c5a5R30).

It should be fixed properly, for example:

  1. set each cover to its GitHub URL
  2. truncate the markdown files to shrink their size, since only the metadata of each article is used on the knowledge-base homepage
  3. etc.
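Option 2 could look roughly like the sketch below. It assumes each article starts with a front-matter block delimited by `---` lines, which is an assumption about the article format and not something confirmed in this issue; `truncateToFrontMatter` is a hypothetical helper name.

```typescript
// Assumption: articles begin with a front-matter block delimited by `---` lines.
// The real format used in education_hub_articles may differ.

/** Keep only the leading front-matter block of a markdown file and drop the body. */
function truncateToFrontMatter(markdown: string): string {
  const lines = markdown.split(/\r?\n/)
  // No opening delimiter: return nothing rather than guess where metadata ends.
  if (lines.length === 0 || lines[0].trim() !== '---') {
    return ''
  }
  // Find the closing delimiter, skipping the opening one at index 0.
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === '---')
  if (end === -1) {
    return ''
  }
  return lines.slice(0, end + 1).join('\n') + '\n'
}
```

Running something like this over the submodule after the article pages are generated would leave only the metadata for the server-rendered homepage to read.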

If @zhangyouxin @WhiteMinds have any better ideas, please share them in this issue.

Keith-CY commented 1 year ago

This patch introduced a problem.

The thumbnail now points to an image on the article page, but I didn't know that images on an article page are not hosted until that page has been visited/scraped. It may be the platform's strategy for saving resources: images on a page are optimized/hosted on demand.

So the thumbnail will be empty until the article page is visited/scraped.

It can be fixed by the method mentioned above:

set cover with their github URL

The drawback is that thumbnails are raw images and not optimized. We may need a better solution, or adopt this one and optimize it later.

Keith-CY commented 1 year ago

An update that sets thumbnails to the raw images on GitHub was proposed in https://github.com/Magickbase/nervos-official-website/pull/317, but we can discuss it to find a better solution.

zhangyouxin commented 1 year ago

Sounds like an image hosting service could be a solution? But that requires replacing all existing images, and it's not convenient when creating a new blog post.

Keith-CY commented 1 year ago

Sounds like an image hosting service could be a solution? But that requires replacing all existing images, and it's not convenient when creating a new blog post.

Yes, using an external host does work but requires extra actions to sync them to the host.

In fact, GitHub is already used as the image host/CDN. In the education repo, all images are committed to the repo, so they can be accessed by a GitHub URL such as https://raw.githubusercontent.com/NervosEducationHub/EducationHubArticles/main/nervosdao_withdrawal_process_explained/images/image1.png
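Building such a URL from an article's slug and its cover path could be sketched as follows. The base URL is taken from the example above; `rawCoverUrl` and its parameter names are hypothetical and not taken from the actual blog.ts.

```typescript
// Base taken from the example URL in this comment.
const RAW_BASE = 'https://raw.githubusercontent.com/NervosEducationHub/EducationHubArticles/main'

/** Build the raw GitHub URL of a cover image from the article slug and a relative image path. */
function rawCoverUrl(slug: string, coverImagePath: string): string {
  // Drop a leading "./" or "/" so the path joins cleanly.
  const relative = coverImagePath.replace(/^\.?\/+/, '')
  // Encode each path segment separately so the "/" separators are preserved.
  const encodedPath = relative.split('/').map(encodeURIComponent).join('/')
  return `${RAW_BASE}/${encodeURIComponent(slug)}/${encodedPath}`
}
```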

However, GitHub doesn't have image optimization (AFAIK), so the knowledge-base homepage loads a lot of data because all covers are raw images. It works, but it's not ideal.

zhangyouxin commented 1 year ago

I've visited the preview page in https://github.com/Magickbase/nervos-official-website/pull/317. The cover images are not that fast to load, but I think it's a good patch if it meets the requirements for now.

Keith-CY commented 1 year ago

I've visited the preview page in #317. The cover images are not that fast to load, but I think it's a good patch if it meets the requirements for now.

I have just compared the data size: the patched page loaded 25.2MB of images while the original one loaded 626kB. I think loading 25MB on a single page visit is a bit crazy 😱

WhiteMinds commented 1 year ago

I tried moving education_hub_articles to knowledge-base and selectively importing images for each blog page, but it still doesn't seem to work. I suspect it's because knowledge-base/index depends on all the images and calculates them based on their original size.

Here is my testing process:

mv public\education_hub_articles src\pages\knowledge-base\education_hub_articles
// change blog.ts
const blogsRootDirectory = join(process.cwd(), 'src', 'pages', 'knowledge-base', 'education_hub_articles')
const eduImages = require.context('../pages/knowledge-base/education_hub_articles/', true, /\.png$/)
// change getBlogBySlug
coverImage = eduImages(`./${slug}/${coverImageURL}`).default
coverImage.fullPath = `${prefix}${coverImage.src}`

WhiteMinds commented 1 year ago

If we need a standardized solution, I think it would be better to move education_hub_articles out of public and automatically update it to a CDN supported by nextjs images (https://nextjs.org/docs/pages/building-your-application/optimizing/images#remote-images) during deployment.
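For reference, allowing a remote host for Next.js image optimization is mostly a configuration change. A minimal sketch, assuming a hypothetical CDN host `cdn.example.com` and a hypothetical path prefix; the object below is what next.config.js would export:

```typescript
// Shape follows the `images.remotePatterns` option of the Next.js config.
// The hostname and pathname are placeholders, not a real deployment target.
const nextConfig = {
  images: {
    remotePatterns: [
      {
        protocol: 'https' as const,
        hostname: 'cdn.example.com', // hypothetical CDN host
        pathname: '/education_hub_articles/**', // hypothetical upload prefix
      },
    ],
  },
}
```

With a pattern like this in place, cover images served from that host could be optimized on demand instead of being shipped inside the serverless function.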

Keith-CY commented 1 year ago

I tried moving education_hub_articles to knowledge-base and selectively importing images for each blog page, but it still doesn't seem to work. I suspect it's because knowledge-base/index depends on all the images and calculates them based on their original size.

Here is my testing process:

mv public\education_hub_articles src\pages\knowledge-base\education_hub_articles
// change blog.ts
const blogsRootDirectory = join(process.cwd(), 'src', 'pages', 'knowledge-base', 'education_hub_articles')
const eduImages = require.context('../pages/knowledge-base/education_hub_articles/', true, /\.png$/)
// change getBlogBySlug
coverImage = eduImages(`./${slug}/${coverImageURL}`).default
coverImage.fullPath = `${prefix}${coverImage.src}`

/knowledge-base/index does depend on all cover images because it's server-rendered. It tries to fetch and optimize the cover images every time a request arrives, so every cover must either be accessible in the folder of static files or be fetchable from an external source.

Another solution is to generate /knowledge-base/index statically. The static paths would then have to be enumerated from the first page to the last page, for every category from "all" to each individual category.

That generates hundreds of pages because of the combinations.
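To make the combination concrete, here is a rough sketch of how the category/page pairs could be enumerated. The category names, the article counts and the page size are made up for illustration; the resulting list is the kind of value a static-path function such as getStaticPaths or generateStaticParams would be built from.

```typescript
interface ListingParams {
  category: string
  page: string
}

/** Enumerate every (category, page) pair given the number of articles per category. */
function listStaticParams(countByCategory: Record<string, number>, pageSize: number): ListingParams[] {
  const params: ListingParams[] = []
  for (const [category, count] of Object.entries(countByCategory)) {
    // Every category gets at least one page, even when it has no articles.
    const pages = Math.max(1, Math.ceil(count / pageSize))
    for (let page = 1; page <= pages; page++) {
      params.push({ category, page: String(page) })
    }
  }
  return params
}
```

The number of generated pages is the sum of the page counts over all categories, which is why it grows quickly as categories and articles are added.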

Keith-CY commented 1 year ago

If we need a standardized solution, I think it would be better to move education_hub_articles out of public and automatically update it to a CDN supported by nextjs images (nextjs.org/docs/pages/building-your-application/optimizing/images#remote-images) during deployment.

I think it's the only correct solution. Here's the list of image hosts that can be used: https://nextjs.org/docs/app/api-reference/next-config-js/images#example-loader-configuration

Keith-CY commented 1 year ago

I've merged the patch first because articles are updated daily.

One improvement was added to the patch: the raw images on GitHub are fetched only as a fallback when the optimized images on Vercel are not available.

That means most thumbnails on the /knowledge-base page are fetched from Vercel, because most articles have been visited before. Only the thumbnails of the most recently updated articles are not yet hosted on Vercel and will be fetched from GitHub directly.

This keeps everything working while reducing the data loaded from GitHub.
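The fallback order can be illustrated with a small sketch. This is not the code of the merged patch; the function names are hypothetical, and it only models the decision "try the optimized image first, then the raw one".

```typescript
/** Ordered list of sources to try for one thumbnail: optimized first, raw GitHub image second. */
function thumbnailSources(optimizedUrl: string, rawGithubUrl: string): string[] {
  return [optimizedUrl, rawGithubUrl]
}

/** Given the source that just failed to load, return the next one to try, or undefined if none is left. */
function nextSource(sources: string[], failed: string): string | undefined {
  const index = sources.indexOf(failed)
  // Unknown source: nothing sensible to fall back to.
  if (index === -1) {
    return undefined
  }
  // Past the end of the array this is undefined, which means "give up".
  return sources[index + 1]
}
```

In a page, something like this could be called from an image's error handler to swap in the next source when the optimized image has not been generated yet.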

Keith-CY commented 1 year ago

The server-rendered page /knowledge-base can be refactored into a statically rendered page by using generateStaticParams.

Ref: