yacy / yacy_search_server

Distributed Peer-to-Peer Web Search Engine and Intranet Search Appliance
http://yacy.net

`crawler.userAgent.name` and `crawler.userAgent.string` ignored in the settings #635

Open GamePlayer-8 opened 2 months ago

GamePlayer-8 commented 2 months ago

Good afternoon,

I have a problem with YaCy's user agent configuration. I set the user agent like so:

crawler.userAgent.name=chimmieyacy
crawler.userAgent.string=chimmieyacy

in /opt/yacy_search_server/DATA/SETTINGS/yacy.conf, but the search server ignores it, and my nginx log still shows a yacybot user agent:

- - fd4d:6169:6c63:6f70::1 - - [11/Apr/2024:12:07:19 +0000] "GET / HTTP/1.1" 200 5566 "-" "yacybot (-global; amd64 Linux 6.1.0-18-amd64; java 11.0.20.1; Etc/en) http://yacy.net/bot.html"
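For anyone who wants to confirm which user agent actually reaches the web server, here is a minimal Java sketch that pulls the user agent out of an access log line like the one above. It assumes the user agent is the last double-quoted field on the line, as in nginx's default "combined" log format; the class name `UserAgentFromLog` is hypothetical and not part of YaCy.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UserAgentFromLog {
    // Matches the final double-quoted field on the line. Assumption: the
    // log format puts the User-Agent last, as nginx's "combined" format does.
    private static final Pattern LAST_QUOTED = Pattern.compile("\"([^\"]*)\"\\s*$");

    // Returns the last quoted field of the log line, or null if there is none.
    static String userAgent(String logLine) {
        Matcher m = LAST_QUOTED.matcher(logLine);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Log line taken from the report above.
        String line = "fd4d:6169:6c63:6f70::1 - - [11/Apr/2024:12:07:19 +0000] "
                + "\"GET / HTTP/1.1\" 200 5566 \"-\" "
                + "\"yacybot (-global; amd64 Linux 6.1.0-18-amd64; java 11.0.20.1; Etc/en) "
                + "http://yacy.net/bot.html\"";
        String ua = userAgent(line);
        System.out.println(ua);
        // "chimmieyacy" is the value configured in yacy.conf in this report.
        boolean custom = ua != null && ua.startsWith("chimmieyacy");
        System.out.println(custom ? "custom agent in use" : "custom agent NOT in use");
    }
}
```

If the log format has been customized so that the user agent is not the last quoted field, the pattern would need adjusting.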

What should I do?

okybaca commented 2 months ago

It's probably a bug. But can you check whether that setting remains in yacy.conf even after a restart? It seems to me that some settings are rewritten upon restart, and the safest way to edit them is while YaCy is down: edit the setting, then start it again.
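One way to check whether the two keys are still in the file after a restart is a small standalone Java sketch like the one below. It assumes yacy.conf consists of plain `key=value` lines that `java.util.Properties` can parse (values containing backslashes may be read differently from how YaCy reads them); the class name `CheckYacyConf` is hypothetical and not part of YaCy.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

public class CheckYacyConf {
    // Loads a key=value file. Assumption: yacy.conf is close enough to the
    // java.util.Properties format for these two ASCII keys to be read back.
    static Properties load(Path conf) throws IOException {
        Properties p = new Properties();
        try (InputStream in = Files.newInputStream(conf)) {
            p.load(in);
        }
        return p;
    }

    public static void main(String[] args) throws IOException {
        // Default path is the one mentioned in this issue; pass another as args[0].
        Path conf = Paths.get(args.length > 0
                ? args[0]
                : "/opt/yacy_search_server/DATA/SETTINGS/yacy.conf");
        Properties p = load(conf);
        String[] keys = {"crawler.userAgent.name", "crawler.userAgent.string"};
        for (String key : keys) {
            System.out.println(key + "=" + p.getProperty(key, "<not set>"));
        }
    }
}
```

Running it once before starting YaCy and once after a restart would show whether the values were overwritten.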

GamePlayer-8 commented 2 months ago

I did that as well: I edited yacy.conf manually while the YaCy container was turned off.

The configuration persists, but the search engine is still using yacybot as a user agent.

For now, I've modified the code manually in ClientIdentification.java (commit info).