skohub-io / skohub-vocabs

A lightweight tool to publish SKOS Vocabularies
https://skohub.io/
Apache License 2.0
34 stars 25 forks source link

127 index all pref labels #152

Closed sroertgen closed 2 years ago

sroertgen commented 3 years ago

Resolves #127.

Now all prefLabels should be indexed.

With line:

https://github.com/skohub-io/skohub-vocabs/blob/cb72bdf51d32ae314bde2ae2e0b55484364aa9e6/src/searchIndex.js#L21

it should also be possible to add further skos properties to FlexSearch index as wanted in #128

I deleted the test for node v10. As stated in #151, we don't want to support node v10 anymore. One check failed because I'm using the method Object.fromEntries which is supported in node since v12 (see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/fromEntries#browser_compatibility).

acka47 commented 3 years ago

@dr0i please deploy to test for me to do a functional review. @literarymachine kindly agreed to review the code later on.

dr0i commented 3 years ago

Deployed to test.You can go trigger new build @acka47 .

acka47 commented 3 years ago

I made a new build for a HCRT fork at https://test.skohub.io/acka47/hcrt/heads/master/index.de.html and it all works as expected: Depending on my choice of language at the top (de/en), I can search the prefLabels of that language.

(However, I noticed that results will only show up if a string matches thre beginning of a word, e.g. searching for "spiel" won't give me "Lernspiel" back. That might be a limitation of FlexSearch or – if FlexSearch can handle this – we should create another issue to address this shortcoming.)

sroertgen commented 3 years ago

Thanks @literarymachine for looking at the code. Please see this commit message for my intention: https://github.com/skohub-io/skohub-vocabs/commit/12b9c27a1c357312e2c146405d1cd60b9d687156

sroertgen commented 3 years ago

(However, I noticed that results will only show up if a string matches thre beginning of a word, e.g. searching for "spiel" won't give me "Lernspiel" back. That might be a limitation of FlexSearch or – if FlexSearch can handle this – we should create another issue to address this shortcoming.)

I think this might be possible. We might have to change the tokenizer: https://github.com/nextapps-de/flexsearch/tree/0.6.32#tokenizer

Should this be changed here or do we want to open a seperate issue for that?

acka47 commented 3 years ago

Should this be changed here or do we want to open a seperate issue for that?

I'll open another issue.

sroertgen commented 3 years ago

In general I am wondering if this would be a good time to (a) switch to a newer version of flexsearch and (b) perhaps use the built-in multi-lingual capabilities to build separate indexes for each language that we could then lazy-load?

Thanks for your feedback @literarymachine.

I will check out the newer flexsearch version and have a look at the multi-language capabilities. I will also test on nested lists.

sroertgen commented 2 years ago

@literarymachine

The v0.7 of flexsearch had some breaking changes. As well it has some issues using the export function (see. e.g. https://github.com/nextapps-de/flexsearch/issues/235). As long as this is not fixed I suggest we fix things in v0.6 and will migrate later.

Based on your comments, especially the lazy loading of the index, I took a much simpler approach, which is also less intrusive. It seems to work on nested lists as well.

If you have time I appreciate your feedback.

sroertgen commented 2 years ago

@dr0i would you mind deploying to test?

dr0i commented 2 years ago

@sroertgen with https://github.com/skohub-io/skohub-vocabs/issues/115 you should be able to do this on your own. You just have to reset the dev branch with your code and push to dev. If this doesn't work don't hesitate to let me know.

sroertgen commented 2 years ago

Thank you! I will try out

sroertgen commented 2 years ago

@literarymachine would you mind having a look at the changes?

acka47 commented 2 years ago

+1 I tested it both with https://test.skohub.io/acka47/hcrt/heads/master/w3id.org/kim/hcrt/scheme.de.html and with nested lists at https://test.skohub.io/acka47/hochschulfaechersystematik/heads/add-english-translations/w3id.org/kim/hochschulfaechersystematik/scheme.en.html and it both works as expected.

acka47 commented 2 years ago

@dr0i please deploy to production.

dr0i commented 2 years ago

Deployed to production. Closed.