algolia / docsearch-scraper

DocSearch - Scraper
https://docsearch.algolia.com/

feat(meta): handle comma-separated version #524

Closed s-pace closed 4 years ago

s-pace commented 4 years ago

This PR enables the use of comma-separated tokens in the docsearch:version meta tag.

The behaviour of the docsearch:version meta tag will be similar to that of the keywords meta tag defined in the HTML5 spec.

The docsearch:version tag can be a set of comma-separated tokens, each of which is a version relevant to the page. Each token must either comply with the SemVer specification or contain only alphanumeric characters (e.g. latest, next). As facet filters, these version tokens are case-insensitive.

For example, all records extracted from a page with the following meta tag:

<meta name="docsearch:version" content="2.0.0-alpha.62,latest">

will be tagged with the following versions:

version:["2.0.0-alpha.62", "latest"]
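
As a rough illustration of the rule described above, here is a minimal Python sketch of how such a meta tag value could be split and validated. It is not the code from this PR: the function name `parse_version_tokens`, the choice to raise on an invalid token, and the stripping of whitespace around tokens are all assumptions made for the example. The SemVer pattern is the regular expression suggested on semver.org.

```python
import re

# SemVer 2.0.0 pattern as suggested on semver.org (anchoring is done with fullmatch).
SEMVER_RE = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?",
    re.ASCII,
)
# Plain alphanumeric aliases such as "latest" or "next".
ALNUM_RE = re.compile(r"[A-Za-z0-9]+")


def parse_version_tokens(content):
    """Split a docsearch:version content string into validated tokens.

    Hypothetical helper: the real scraper may name and structure this differently.
    """
    tokens = []
    for raw in content.split(","):
        token = raw.strip()  # assumption: surrounding whitespace is ignored
        if not token:
            continue  # assumption: empty tokens (e.g. a trailing comma) are skipped
        if SEMVER_RE.fullmatch(token) or ALNUM_RE.fullmatch(token):
            tokens.append(token)
        else:
            raise ValueError(f"invalid version token: {token!r}")
    return tokens


print(parse_version_tokens("2.0.0-alpha.62,latest"))
```

Because a comma cannot appear in a SemVer string or in an alphanumeric alias, splitting on commas before validating each token is unambiguous.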

This PR follows the ongoing work of the Docusaurus team, which aims to introduce version aliases: facebook/docusaurus#3393, facebook/docusaurus#3391

slorber commented 4 years ago

Thanks 👍

Wondering if some people have commas in their version names 😅

francoischalifour commented 4 years ago

Do we explicitly mention that we only support SemVer? If so, commas should work as a separator because they are not allowed in valid SemVer versions.

s-pace commented 4 years ago

I would recommend restricting versions to the semantic versioning format.

francoischalifour commented 4 years ago

Note that latest is not a valid SemVer range, though I'm not sure it matters much to check the validity of the filter.

s-pace commented 4 years ago

I explicitly document the use of these tags in https://github.com/algolia/docsearch-website/pull/49. Each version token must either comply with the SemVer specification or contain only alphanumeric characters (e.g. latest, next). As facet filters, these version tokens are case-insensitive.