yacy / yacy_search_server

Distributed Peer-to-Peer Web Search Engine and Intranet Search Appliance
http://yacy.net

How can I stop old websites appearing? #210

Open songproducer opened 6 years ago

songproducer commented 6 years ago

I've got sort by /date active and the top results are often really old even though they were indexed recently.

Is there a fq that can filter out urls that have old dates in them?

luccioman commented 6 years ago

Yes @songproducer, you can modify the Date Profile on the RankingSolr_p.html page, for example with fq=last_modified:[NOW/DAY-10DAY TO *], which filters out documents whose last modification date is more than ten days old.

But once again, keep in mind that this filter query applies only to results from local and remote Solr indexes, not to results obtained from the local or remote RWI structure.
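
For anyone who wants to try the last_modified filter outside the ranking page, here is a minimal sketch that builds a Solr-style select URL carrying the same fq. The host, port, and /solr/select path are assumptions about a default local peer, and the helper names date_filter and build_select_url are made up for illustration; adjust them to your setup.

```python
from urllib.parse import urlencode

# Assumption: a local peer exposing a Solr-style select endpoint here.
BASE_URL = "http://localhost:8090/solr/select"


def date_filter(days: int) -> str:
    """Return a filter query keeping documents modified in the last `days` days."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    # Solr date math: NOW/DAY rounds down to midnight, -{days}DAY steps back.
    return f"last_modified:[NOW/DAY-{days}DAY TO *]"


def build_select_url(query: str, days: int, base_url: str = BASE_URL) -> str:
    """Build a select URL with the date filter attached as fq."""
    params = {"q": query, "fq": date_filter(days), "wt": "json"}
    # urlencode percent-escapes the brackets, colon, slash and spaces.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_select_url("news", 10))
```

As noted above, a filter like this only affects results that come from Solr, so RWI results would still need separate handling.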

songproducer commented 6 years ago

@luccioman thanks! I ended up turning RWI back on and blacklisting individual sport URLs.

songproducer commented 6 years ago

@luccioman I'm still getting results that have been freshly indexed but have old dates in the URL or description. For example, when I search for news I'm getting results like these:

http://skalusa.org/news/?rkey=20160510LA94621&filter=6144

Luciano Bello - PhD student at Chalmers - Home News December 17, 2015: http://www.lucianobello.com.ar/

songproducer commented 6 years ago

Hi @luccioman I'm struggling to come up with a filter query that filters out old dates from the URL.

I'm adding this: &qf=url_paths_sxt: but I'm not sure how to filter out the dates after that.
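
To make the goal concrete, here is a rough sketch of the check as a client-side post-filter in Python rather than as a filter query. It is not a YaCy or Solr feature: the function name has_old_date, the regular expression, and the min_year cutoff are all assumptions for illustration, and it only looks at the URL, not the description.

```python
import re

# Matches a 19xx/20xx year followed by a month and an optional day, with or
# without separators: 20160510, 2016-05-10, 2016/05, 2016_05_10, ...
DATE_IN_URL = re.compile(
    r"(?<!\d)((?:19|20)\d{2})"            # year, not preceded by a digit
    r"[/\-_.]?(?:0[1-9]|1[0-2])"          # month 01-12
    r"(?:[/\-_.]?(?:0[1-9]|[12]\d|3[01]))?"  # optional day 01-31
    r"(?!\d)"                             # not followed by a digit
)


def has_old_date(url: str, min_year: int) -> bool:
    """Return True if the URL embeds a date whose year is before min_year."""
    return any(int(m.group(1)) < min_year for m in DATE_IN_URL.finditer(url))


def drop_old(urls, min_year):
    """Keep only URLs that do not embed a date older than min_year."""
    return [u for u in urls if not has_old_date(u, min_year)]


if __name__ == "__main__":
    results = [
        "http://skalusa.org/news/?rkey=20160510LA94621&filter=6144",
        "http://www.lucianobello.com.ar/",
    ]
    print(drop_old(results, min_year=2024))
```

With the two results quoted earlier, the first URL would be dropped because of the embedded 20160510, while the second would pass, since its old date appears only in the description.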