Closed favoyang closed 3 years ago
Hi @favoyang
We may still need some discussion before we make it public. FYI, @meteorlxy is working on https://github.com/vuepress/vuepress-next
Thanks for the link. My first impression is that it's going to focus on Vue 3 and TypeScript support.
I assume you have noticed these scalability issues, and perhaps @meteorlxy can see this thread.
This is basically a broad topic. Do you prefer I close this issue, or leave it open for other developers to share their wishes on the next major release?
My first impression is that it's going to focus on Vue 3 and TypeScript support.
In fact, it's a completely refactored new version.
As for the scalability, we may need some real-world large projects for testing. It would be nice if you guys could provide one @favoyang @Mister-Hope @itsxallwater @xbill82 @JimmyVV
Absolutely!
@itsxallwater If it's OK to open source your repo, you can simply paste the link here. If not, you can invite me to a private repo
BTW, I want to share the progress of vuepress-next:
Core features are about 90% finished. The remaining 10% are optional features that are still to be determined.
Now users can try to use vuepress-next with their own theme.
The remaining two big goals (help wanted @Community & core team :smile:):
@itsxallwater If it's OK to open source your repo, you can simply paste the link here. If not, you can invite me to a private repo
It's open, by all means! Please don't hesitate to let us know if there's anything we can do to help test.
This is a good candidate. It contains 2000+ static pages.
My open-source project openupm contains 1300+ pages. But most pages are generated via additionalPages, and the API seems to have changed in vuepress-next, so it could take more time to migrate. For benchmarking purposes, I think @itsxallwater's project is cleaner and a better fit. But I'd like to share some information.
What troubles me most is the memory usage: as the site grows, VuePress requires more and more memory to build. It now requires 6 GB. The GitHub Actions build bot has a 7 GB limit, so it's close. I use the additionalPages feature, so I also need to verify whether my generator code is GC-unfriendly. I'm also worried that leveraging multiple cores to speed up the build (https://github.com/vuejs/vuepress/issues/1560) may make the issue worse.
The second is that the siteData.js bundle is getting bigger. I guess VuePress needs to know all pages (path, title, heads) in advance for the router and search. But does that mean all frontmatter also needs to be packed into siteData.js as well? I'm not entirely sure about this, but if you check $site.pages, it contains all frontmatter info. If you're using the vuepress-plugin-seo plugin, it also contributes some metadata to the frontmatter.
[BABEL] Note: The code generator has deoptimised the styling of /home/favo/projects/openupm/.temp/internal/siteData.js as it exceeds the max of 500KB.
One example page with a fat frontmatter stored in $site.pages:
{
  "title": "Packages - GUI",
  "frontmatter": {
    "layout": "PackageList",
    "showFooter": false,
    "noGlobalSocialShare": true,
    "title": "Packages - GUI",
    "topics": [
      ... <it's really long anyway>
    ]
  },
  "regularPath": "/packages/topics/gui/",
  "key": "v-51883712",
  "path": "/packages/topics/gui/",
  "content": ""
}
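To put a number on how much of the page list is frontmatter, something like the following could be run against a dump of $site.pages. This is only a rough sketch: the `PageEntry` shape is modelled on the example entry above, and the function names (`jsonLength`, `frontmatterShare`) and the sample data are made up for illustration, not part of VuePress.

```typescript
// Hypothetical page shape, modelled on the example entry above.
interface PageEntry {
  title: string;
  path: string;
  key: string;
  regularPath: string;
  frontmatter: Record<string, unknown>;
}

// Length in characters of a value once serialised to JSON.
function jsonLength(value: unknown): number {
  return JSON.stringify(value).length;
}

// Fraction (0..1) of the serialised page list taken up by frontmatter.
function frontmatterShare(pages: PageEntry[]): number {
  const total = jsonLength(pages);
  let frontmatter = 0;
  for (const page of pages) {
    frontmatter += jsonLength(page.frontmatter);
  }
  return frontmatter / total;
}

// Made-up sample: one page with a long "topics" list in its frontmatter.
const sampleTopics: string[] = [];
for (let i = 0; i < 100; i++) {
  sampleTopics.push("topic-" + i);
}
const samplePages: PageEntry[] = [
  {
    title: "Packages - GUI",
    path: "/packages/topics/gui/",
    key: "v-51883712",
    regularPath: "/packages/topics/gui/",
    frontmatter: { layout: "PackageList", topics: sampleTopics },
  },
];
const sampleShare = frontmatterShare(samplePages);
```

If the share comes out high on a real site, that would support moving frontmatter out of the shared bundle.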
What troubles me most is the memory usage: as the site grows, VuePress requires more and more memory to build.
When we are trying to bundle a huge web app, I'm afraid that we have to load all the files into memory.
The second is that the siteData.js bundle is getting bigger. I guess VuePress needs to know all pages (path, title, heads) in advance for the router and search.
Yes, $site.pages is a problem.
It's required for the built-in search box, and may be useful for blog users who want to generate an index page to list all of their posts.
However, for a documentation site that uses the Algolia search box, it's useless to load all pages data.
In current vuepress-next, pagesData is extracted from siteData. But it's mainly for hot reload purposes, and we still have to load all pages data.
VitePress drops the built-in search feature, and injects page data into its own component to avoid this.
It might be a good choice to migrate to vitepress for large scale docs site. :thinking:
Or, we should also drop the built-in search to get rid of the limitation
Hey @meteorlxy thanks for your work!
When we are trying to bundle a huge web app, I'm afraid that we have to load all the files into memory.
Can you elaborate on this point, please? IMHO the memory leak shows up when generating the static HTML pages, which can be done (I might be naive) by reading the .md files one by one from the file system.
As for the scalability, we may need some real-world large projects for testing.
All our docs at @kuzzleio are open-source, but the MD files are scattered across repos. We maintain a repo for the Vuepress code and use our own CLI to gather the MD files and build them against our Vuepress code.
Would you like me to share with you the necessary steps to do it? It's three or four commands.
@xbill82
Can you elaborate on this point, please? IMHO the memory leak shows up when generating the static HTML pages, which can be done (I might be naive) by reading the .md files one by one from the file system.
Memory leakage is out of scope. We will always use more memory as the project grows.
However, the memory usage of the current VuePress 1.x is abnormal. There might be a memory leak, but it's not easy to track down.
The whole process (including the SSR of Vue 3) of vuepress-next is different from VuePress 1.x, which will hopefully solve these problems. But it has not been verified, which is why I'm asking you guys to provide your repos :wink:
I will start to test vuepress-next on large scale projects when the skeleton of default theme is ready.
OK, let's put our hope in the next release. I'll prepare a gist for you with the commands to test it with our (huge) documentation.
Here it goes @meteorlxy https://gist.github.com/xbill82/dc81f7d025533014f210ef32b47e9b80 have fun and feel free to ask for help on the Kuzzle Discord, you can mention me @luca.m
Thanks for the link to Vite and VitePress. Interesting concept, but definitely costly to migrate to an immature framework, not to mention the VuePress plugins I use...
However, the memory usage of the current VuePress 1.x is abnormal. There might be a memory leak, but it's not easy to track down.
I agree that we should re-analyze the memory footprint and build time with vuepress-next when it's ready. It's a big refactor anyway.
@Mister-Hope has some findings in https://github.com/vuejs/vuepress/issues/2656#issuecomment-715339746, which I quote below. Maybe it's helpful for figuring out potential anti-GC practices. I'm totally okay with loading all page data into memory, but 2048 pages with a 100 KB memory footprint each should only cost about 200 MB. VuePress 1 is requesting much more. There could be something quite obvious that stops the GC from doing its job.
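For reference, the back-of-the-envelope figure above works out as follows. The helper name `estimateMiB` is made up for illustration, and the 100 KB per page is the rough guess from the comment, not a measurement.

```typescript
// Rough estimate: page count times per-page footprint, converted from KiB to MiB.
function estimateMiB(pageCount: number, kibPerPage: number): number {
  return (pageCount * kibPerPage) / 1024;
}

// 2048 pages at roughly 100 KiB each.
const expectedMiB = estimateMiB(2048, 100); // 200
```

That is far below the 5 to 7 GB reported in this thread for VuePress 1.x builds.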
Mister-Hope: VuePress handles the build process badly. From the source code, it generates a lot of shallow copies of frontmatter, page objects (including slug, frontmatter, headings and some other info) and even siteData. The build process uses a newer copy of these objects while still referencing some parts of the old ones, so the old ones move to "old space" instead of being garbage collected.
It's required for the built-in search box, and may be useful for blog users who want to generate an index page to list all of their posts. However, for a documentation site that uses the Algolia search box, it's useless to load all pages data. In current vuepress-next, pagesData is extracted from siteData. But it's mainly for hot reload purposes, and we still have to load all pages data.
For the search feature, you may want to build an index with just enough data for users to run a search. That means title, headlines, tags, the page ID, and URL. Packing this compact data into the main JS is fine. If it's only useful for the search feature, the search plugin can collect it on its own, not at the system level. Then VuePress can load the other parts of a page dynamically: the frontmatter and the content for an individual page request.
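A compact index along those lines could look roughly like this. It is a sketch under assumptions, not the VuePress search plugin's actual code: the `FullPage` and `SearchEntry` shapes, the `tags` frontmatter field, and both function names are hypothetical.

```typescript
// Hypothetical full page record, as it might exist at build time.
interface FullPage {
  key: string;
  title: string;
  path: string;
  headers?: { title: string }[];
  frontmatter: Record<string, unknown>;
}

// Compact entry shipped to the browser: just enough to run a search.
interface SearchEntry {
  key: string;
  title: string;
  path: string;
  headers: string[];
  tags: string[];
}

// Strip each page down to its searchable fields, dropping the rest of the frontmatter.
function buildSearchIndex(pages: FullPage[]): SearchEntry[] {
  return pages.map((page) => {
    const rawTags = page.frontmatter["tags"];
    const tags: string[] = Array.isArray(rawTags)
      ? rawTags.filter((t): t is string => typeof t === "string")
      : [];
    return {
      key: page.key,
      title: page.title,
      path: page.path,
      headers: (page.headers || []).map((h) => h.title),
      tags,
    };
  });
}

// Case-insensitive substring match over title, headers and tags.
function searchIndex(index: SearchEntry[], query: string): SearchEntry[] {
  const q = query.trim().toLowerCase();
  if (q === "") return [];
  return index.filter((entry) =>
    [entry.title, ...entry.headers, ...entry.tags].some(
      (text) => text.toLowerCase().indexOf(q) !== -1
    )
  );
}
```

The point of the design is that only `SearchEntry` objects reach the client; everything else about a page stays out of the shared bundle until that page is requested.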
For generating a (paginated) index page for a blog, it happens at the build stage, where you can do whatever you want. E.g. get the 10 most recent articles and save them into the home page frontmatter. But you don't need to ship the whole website data just to let the browser do the filtering.
What I'm trying to argue is that packing everything and making users download it on the first load makes it very hard for bigger projects to scale. Maybe it's okay for hundreds of pages, but think about 10k pages. It's a design decision that does not scale.
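As an illustration of doing that selection at build time, a helper like the one below could pick the latest posts before anything is sent to the browser. The `Post` shape and the `latestPosts` name are assumptions for the sketch, and dates are assumed to be ISO strings.

```typescript
// Hypothetical post summary available at build time; date is an ISO string.
interface Post {
  title: string;
  path: string;
  date: string;
}

// Return the `limit` most recent posts, newest first, without mutating the input.
function latestPosts(posts: Post[], limit: number = 10): Post[] {
  return posts
    .slice()
    .sort((a, b) => Date.parse(b.date) - Date.parse(a.date))
    .slice(0, limit);
}
```

The result could then be written into the home page's frontmatter, so the client only receives the handful of entries it actually renders.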
Renamed title to "Vuepress-next scalability" to match with the actual discussion.
https://github.com/Mister-Hope/Mister-Hope.github.io
My blog. Happy to test it as soon as vuepress-next is released with a version.
Currently 7GB with my theme and 5GB with theme-default using vuepressV1
What I'm trying to argue is that packing everything and making users download it on the first load makes it very hard for bigger projects to scale. Maybe it's okay for hundreds of pages, but think about 10k pages. It's a design decision that does not scale.
That's exactly what I want to say: as the number of pages grows, there will be hundreds of JS links (with long hashed file names) in the head tag of the generated HTML.
For my blog, it's 30MB of JS and 70MB of HTML. Each HTML file averages around 90KB (excluding some special ones), of which 50KB are JS link tags in the head. (That takes about 30MB for my 660 pages.) The JS link size is nearly the same as the actual content size.
I think that's another problem we should be careful about in V2; I cannot imagine the length of the head tag for a site containing thousands of pages.
However, since VuePress actually works as an SPA, I am afraid we cannot drop the JS files unless we do the SSR another way. I'd like pages in the same folder to be grouped into one chunk when a large number of pages is detected, or for users to be able to configure it.
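The grouping idea could be sketched as a function that maps each page path to a chunk key based on its folder. This is purely illustrative: `chunkKey`, `groupPages` and the `depth` parameter are hypothetical names, and nothing here is an existing VuePress or bundler option.

```typescript
// Chunk key for a page: the first `depth` segments of its containing directory.
function chunkKey(path: string, depth: number = 1): string {
  const dir = path.slice(0, path.lastIndexOf("/") + 1);
  const segments = dir
    .split("/")
    .filter((s) => s.length > 0)
    .slice(0, depth);
  return segments.length > 0 ? "/" + segments.join("/") + "/" : "/";
}

// Group page paths by chunk key, so each group could be emitted as one chunk.
function groupPages(paths: string[], depth: number = 1): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const path of paths) {
    const key = chunkKey(path, depth);
    (groups[key] = groups[key] || []).push(path);
  }
  return groups;
}
```

With one chunk per group instead of one per page, the number of script and link tags per HTML file would track the number of folders rather than the number of pages.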
@favoyang
If it's only useful for the search feature, the search plugin can collect it on its own, not at the system level.
Yes that's exactly what I'm thinking about: let search plugin and blog plugin to collect data themselves.
@Mister-Hope
there will be hundreds of JS links (with long hashed file names) in the head tag of the generated HTML
Thanks for this point. I think what you mentioned is the prefetch links?
In fact, if you set shouldPrefetch: () => false, the links will not be rendered at all. But it's not perfect, because the renderer will still map the files and test each one against shouldPrefetch (which slows down the SSR process, too).
We might allow shouldPrefetch: false to disable prefetch links entirely.
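For anyone wanting to try the function form, the option would sit in the site config roughly as below. The sketch writes it as a plain object; in a real VuePress 1.x project this object would be the export of .vuepress/config.js. The constant name `siteConfig` and the parameter names are only for illustration.

```typescript
// Sketch of the relevant part of a site config (the real file would export this object).
const siteConfig = {
  // Returning false for every file means no prefetch links are rendered.
  // Parameter names are illustrative; both arguments are ignored here.
  shouldPrefetch: (_file: string, _type: string): boolean => false,
};
```

As noted above, the function is still called once per file, so this removes the links from the output but not the per-file check during rendering.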
------ updated
Now vuepress-next has implemented those features:
shouldPrefetch: false
Yes, the prefetch link is taking n × n space when the pages grow.
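To make the n × n growth concrete, here is a rough estimate using the blog figures quoted earlier in the thread. The 76 bytes per link is an assumption, derived from roughly 50KB of link tags spread over 660 pages; the function name is made up.

```typescript
// If every page's head carries one prefetch link per page chunk,
// the total link markup grows with the square of the page count.
function prefetchMarkupBytes(pageCount: number, bytesPerLink: number): number {
  const linksPerPage = pageCount; // assumption: one link per page chunk
  return pageCount * linksPerPage * bytesPerLink;
}

const assumedBytesPerLink = 76; // assumption: about 50KB / 660 links
const blogEstimate = prefetchMarkupBytes(660, assumedBytesPerLink); // 33105600 bytes, about 33MB
const tenThousandPages = prefetchMarkupBytes(10000, assumedBytesPerLink); // 7600000000 bytes, about 7.6GB
```

The 660-page estimate lands close to the 30MB reported for the blog, and the same assumption puts a 10k-page site at several gigabytes of link tags alone.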
@Mister-Hope Quick response. I updated my last comment and it should work in vuepress 1.x
@meteorlxy thank you for working on VuePress 2, look forward to using it! I can test on workshops.frontendfoxes.org and other sites, I'll give it a try
@meteorlxy I've done a preliminary build of our public docs using vuepress-next and generally speaking, it seems to work! Neato! 👍👍
This is without ejecting the theme and without bothering with any plugins, but the build and render is working.
Now for the bad news ☹️:
README.md along with any image assets that should be referenced in the article into the directory. These images are then referenced in the readme a la [image description](./image.png), which does not appear to be working in this version. Example:
ERROR in ./docs/.vuepress/.temp/pages/jbase/faq/backups-using-veeam/windows-restore/README.vue?vue&type=template&id=eefd9c2c (./node_modules/vue-loader/dist/templateLoader.js??ref--5!./node_modules/cache-loader/dist/cjs.js??ref--0-0!./node_modules/vue-loader/dist??ref--0-1!./docs/.vuepress/.temp/pages/jbase/faq/backups-using-veeam/windows-restore/README.vue?vue&type=template&id=eefd9c2c)
Module not found: Error: Can't resolve './windows_restore_9.png' in '/home/mikew/src/Internal/vuepress-next/docs/.vuepress/.temp/pages/jbase/faq/backups-using-veeam/windows-restore'
@itsxallwater Thanks!
It seems that your build had already failed, but our CLI didn't terminate the process, so you thought it was stuck. It's a point to improve, but it has nothing to do with scalability.
Nice catch, and it will be fixed soon.
Let's move to the vuepress-next repo for further discussion and bug reports.
--- update
I've tested on your project and it takes less than 4 minutes to build your site :wink: @itsxallwater
I've posted the results in the related issue in the vuepress-next repo. See https://github.com/vuepress/vuepress-next/issues/8
When you own a blog with 200k pages, you would rather die than keep hosting it, especially with VuePress 2.x. router.js and blogData.js are too large (more than 40MB)...
Already received it~
Hi there,
I'd like to know if there's a roadmap for the next major update, VuePress 2. The 1.0 release cycle, led by @ulivz, was awesome. Now the project seems to be maintained by @billyyyyy3320 and @bencodezen, with minimal fixes. But it's not clear who is actually leading the next major update.
Personally, I would like to see improvements that let VuePress scale to bigger projects, the ones with many thousands of pages.
#1560: make the build faster.
#2656, #2448, #1819: limit memory usage during the build process. Currently, it scales linearly with the number of pages.
Of course, people from different backgrounds may have different priorities, like supporting Vue.js 3, for example.