ssborbis / ContextSearch-web-ext

Search engine manager for modern browsers

Extract substring after stop word #637

Closed Parvares closed 1 year ago

Parvares commented 1 year ago

Hi Mike, may I ask you a question? Given this template URL in ContextSearch:

https://www.bibliotechediroma.it/opac/query/%s?context=tmatm

I need to extract the substring that follows any of these stop words (definite articles) when one of them appears at the beginning of the input string: Il, Lo, La, I, Gli, Le, L'

Here are some examples:

La vita sulla terra e il futuro del genere umano > vita sulla terra e il futuro del genere umano
Il nome della rosa > nome della rosa
L'avaro > avaro
La globalizzazione e i suoi oppositori > globalizzazione e i suoi oppositori
I demoni > demoni

I found something similar here.

Thanks very much!

ssborbis commented 1 year ago

If the examples you give are contained in the search string %s, you can use the Modify Search Terms field in the search engine edit modal to replace those leading strings. The field takes an expression of this form:

/regex/replacement/[giym]

Something like (untested): /^(Il |Lo |La |I |Gli |Le |L')//g

Mind the spaces, and check https://regex101.com/ for reference.
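As a rough illustration of what a `/regex/replacement/flags` expression does to the search terms, here is a minimal TypeScript sketch. The `applyModifier` helper is hypothetical and only mimics the behavior described above; it is not the add-on's actual implementation, and the parsing it uses is an assumption.

```typescript
// Hypothetical helper (not ContextSearch's real code): parse a
// "/regex/replacement/flags" expression and apply it to the search terms.
function applyModifier(modifier: string, terms: string): string {
  // Split the expression into pattern, replacement and flags.
  const match = /^\/(.*)\/(.*)\/([a-z]*)$/.exec(modifier);
  if (match === null) {
    return terms; // Not a valid expression: leave the terms untouched.
  }
  const pattern = match[1] ?? "";
  const replacement = match[2] ?? "";
  const flags = match[3] ?? "";
  return terms.replace(new RegExp(pattern, flags), replacement);
}

// The suggestion from the thread: strip a leading Italian definite article.
const suggested = "/^(Il |Lo |La |I |Gli |Le |L')//g";

console.log(applyModifier(suggested, "Il nome della rosa"));
console.log(applyModifier(suggested, "L'avaro"));
console.log(applyModifier(suggested, "I demoni"));
```

Because the pattern is anchored with `^`, it can only match once at the start of the string. Every alternative except `L'` ends with a space, so a word such as "Lettera" is not truncated.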

Parvares commented 1 year ago

/^(Il |Lo |La |I |Gli |Le |L')//g

Thanks Mike, but it doesn't seem to be working well...

ssborbis commented 1 year ago

What is your search string? It seems to be working for me

Parvares commented 1 year ago

Strangely, it works on some websites (GitHub, for example) but not on all of them: it does not work on Google Search, for instance.

ssborbis commented 1 year ago

Could you clarify a little?

When you say it's not working "from" a website, are you saying that if you select text on those particular websites (Google, for instance) and search through this add-on, using the engine with the template https://www.bibliotechediroma.it/opac/query/%s?context=tmatm, the search does not work or the search terms are not modified?

Parvares commented 1 year ago

Yes, I mean this.

ssborbis commented 1 year ago

Is the problem due to case sensitivity or a leading space?

Parvares commented 1 year ago

Correct, that was the problem. Now it works perfectly. Thanks again, Mike!

/^(Il |Lo |La |I |Gli |Le |L')//gi
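To show why the added `i` flag makes the difference, the sketch below compares the original and final patterns on a lowercase selection. The `tolerant` pattern is a hypothetical variant that is not from the thread; it assumes the selected text might also begin with whitespace, which the final pattern on its own does not handle.

```typescript
// Original suggestion: case-sensitive, so "il nome" is left untouched.
const caseSensitive = /^(Il |Lo |La |I |Gli |Le |L')/g;

// Final version from the thread: the "i" flag ignores case.
const caseInsensitive = /^(Il |Lo |La |I |Gli |Le |L')/gi;

// Hypothetical variant (an assumption, not from the thread):
// also skip any whitespace in front of the article.
const tolerant = /^\s*(Il |Lo |La |I |Gli |Le |L')/i;

// Remove the first match of the pattern from the search terms.
function stripArticle(terms: string, pattern: RegExp): string {
  return terms.replace(pattern, "");
}

console.log(stripArticle("il nome della rosa", caseSensitive));
console.log(stripArticle("il nome della rosa", caseInsensitive));
console.log(stripArticle(" La vita sulla terra", caseInsensitive));
console.log(stripArticle(" La vita sulla terra", tolerant));
```

Since `^` without the `m` flag matches only at the start of the string, the `g` flag is harmless here but has no effect. The `i` flag is the one that changes the result.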

ssborbis commented 1 year ago

No prob. Cheers!