ZORALab / Hestia

One Peaceful Frontend+Backend Software Library Suite.
https://hestia.zoralab.com

Upgrade Sitemap.xml Algorithm to Depth Discovery #4

Closed hollowaykeanho closed 1 year ago

hollowaykeanho commented 2 years ago

Description

A single sitemap has hard limits of 50MB (uncompressed) total size and 50,000 URLs, and there is no easy way to test against these limits, so it is better to use an algorithmic approach to sitemap discovery. See: https://developers.google.com/search/docs/advanced/sitemaps/build-sitemap
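
For illustration only, the two limits could be checked with a rough sketch like the one below. It is not Hestia's implementation: the names `render_urlset` and `exceeds_limits` are hypothetical, and it assumes 50MB means 50 * 1024 * 1024 bytes and that each entry carries only a `<loc>` element.

```python
from xml.sax.saxutils import escape

MAX_URLS = 50_000              # maximum URLs per sitemap file
MAX_BYTES = 50 * 1024 * 1024   # assumed byte value of the 50MB uncompressed limit


def render_urlset(urls):
    """Render a minimal <urlset> document (hypothetical helper, <loc> only)."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        # escape() turns &, < and > into XML entities
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


def exceeds_limits(urls):
    """Return True if one sitemap holding `urls` would break either limit."""
    if len(urls) > MAX_URLS:
        return True
    return len(render_urlset(urls).encode("utf-8")) > MAX_BYTES
```

A check like this only tells you when a single file is too large. It does not decide how to split it, which is what the discovery approach is meant to avoid having to do.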

Secondly, some search engines allow image mapping. See: https://developers.google.com/search/docs/advanced/sitemaps/image-sitemaps

One way is to have each page render its own sitemap.xml, then have the parent sitemap use the <sitemap> tag to discover the child pages' sitemaps. See: https://developers.google.com/search/docs/advanced/sitemaps/large-sitemaps
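
As a minimal sketch of the parent side of this idea, assuming the child sitemap URLs are already known, a parent file in the sitemaps.org protocol wraps each child in a `<sitemap>` entry inside a `<sitemapindex>` root. The function name `render_sitemap_index` and the example URLs are hypothetical and not taken from Hestia.

```python
from xml.sax.saxutils import escape


def render_sitemap_index(child_sitemap_urls):
    """Render a parent sitemap index that points at child sitemap files."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in child_sitemap_urls:
        # One <sitemap> entry per child; crawlers follow <loc> to fetch it
        lines.append(f"  <sitemap><loc>{escape(url)}</loc></sitemap>")
    lines.append("</sitemapindex>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical child locations, for illustration only
    print(render_sitemap_index([
        "https://example.com/docs/sitemap.xml",
        "https://example.com/blog/sitemap.xml",
    ]))
```

Each child file stays small on its own, so the parent only needs to list where the children live.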

This removes the burden of developing a complex sitemap algorithm in case any user manages to exceed the maximum limits with Hestia.

Expected Behavior

A self-discovery sitemap system is developed for a future Hestia version.

Current Behavior

A single sitemap file system is currently deployed, which risks hitting the maximum limits.

hollowaykeanho commented 2 years ago

Dependency https://github.com/ZORALab/Hestia/issues/16 developed. Ready to deploy.

hollowaykeanho commented 2 years ago

Constructed the data structure in add57b6d4562b3e773ef2fc21582f90ec843336d

hollowaykeanho commented 2 years ago

Implemented in f210ae457441eb26885f72fdf649bda1530038e9

Dropped media type URLs (https://developers.google.com/search/docs/advanced/sitemaps/image-sitemaps) because they do not comply with the standard protocol (https://www.sitemaps.org/protocol.html); Google did not upstream that development to it.

hollowaykeanho commented 1 year ago

Released in https://github.com/ZORALab/Hestia/releases/tag/v1.0.0