I confirm that this contribution is made under the terms of the license found in the root directory of this repository's source tree and that I have the authority necessary to make this contribution on behalf of its copyright owner.
I suspect that the paragraph_index hasn't been fed in a while, as running the build script added an extra "-e" to the front matter of the HTML files, resulting in an empty feed for me.
Improves the chunking of the pyvespa docs fed to docsearch:
[x] Updated the bash script to fix the issue mentioned above.
[x] Removed empty jupyter cell references.
[x] Fixes overly long chunks for reference-api.html by splitting on class methods.
[x] Fixes bad formatting due to double escaping of backslashes.
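
For reviewers, here is a rough sketch of the idea behind splitting on class methods. It is not the code in this PR: the function name `split_on_methods` and the heading pattern (a level-4 markdown heading per method) are assumptions made for illustration only, and the real reference-api.html markup may look different.

```python
import re

# Hypothetical pattern: assumes every class method starts with a level-4
# markdown heading such as "#### feed_iterable". Adjust to the real markup.
METHOD_HEADING = re.compile(r"^#### ", flags=re.MULTILINE)


def split_on_methods(text: str) -> list[str]:
    """Split text into chunks, starting a new chunk at each method heading."""
    starts = [m.start() for m in METHOD_HEADING.finditer(text)]
    # Keep any text before the first method heading (e.g. the class intro).
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    bounds = starts + [len(text)]
    chunks = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    # Drop chunks that are empty after stripping whitespace.
    return [c for c in chunks if c]
```

With this approach each method ends up in its own chunk instead of one very long chunk per class, which is what should make the reference docs easier to match in search.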
Link to previous test run: https://github.com/vespa-engine/pyvespa/actions/runs/10370895936/job/28709859043
This seems to give better results for reference docs, e.g.: