nicolas-raoul / Wikipedia-Reliable-Sources

Configuration for WRS, the search engine that aims to only return results from reliable websites
https://en.wikipedia.org/wiki/User:Syced/Wikipedia_Reference_Search
MIT License

Update annotations.tsv #6

Status: Closed (Superb-Owl-Wiki closed 1 month ago)

Superb-Owl-Wiki commented 2 months ago

First step in merging the Reliable Source Engine list over here.

nicolas-raoul commented 1 month ago

Same here, would you mind doing the same to the XML? Thanks a lot!

Superb-Owl-Wiki commented 1 month ago

> Same here, would you mind doing the same to the XML? Thanks a lot!

So I removed the URLs that seemed concerning or might otherwise be less reliable, and replaced some of them. For the others, I haven't figured out how to get them into the right format to integrate with your database. In the meantime, if you want, you can copy/paste the URL column from this spreadsheet directly into the CSE; it should only add the ones not already included (this is my entire search engine): https://github.com/Superb-Owl-Wiki/Reliable-Source-Engine/blob/main/Reliable%20Sources.xlsx

nicolas-raoul commented 1 month ago

Editing the XML is a more future-proof solution than copy/pasting into the CSE UI, I think. Anyway, I just finished performing the conversion via the low-tech solution screenshotted below. :-) You may want to double-check the resulting XML (committed).

For meta-information such as Opinion?/Paywall?/Notes I would suggest adding an XML comment (example: <!-- Opinion?:Y* / Paywall?:N / Notes:Can only filter by author -->) at the end of each XML line if needed.
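
In case it helps, here is a rough Python sketch of how such a trailing comment could be appended to a line. It is only an illustration: the function name `with_meta_comment` and the `<Annotation about="example.com/*"/>` placeholder are made up for this example and are not taken from the repository's XML. The field names and separator follow the example comment above.

```python
def with_meta_comment(xml_line, opinion=None, paywall=None, notes=None):
    """Append an XML comment carrying the spreadsheet's meta-information.

    Hypothetical helper: only the comment layout (Opinion?/Paywall?/Notes
    separated by " / ") comes from the suggestion above.
    """
    fields = [("Opinion?", opinion), ("Paywall?", paywall), ("Notes", notes)]
    # Keep only the fields that actually have a value.
    parts = [f"{name}:{value}" for name, value in fields if value]
    if not parts:
        return xml_line  # nothing to add, leave the line untouched
    body = " / ".join(parts)
    # XML comments must not contain "--" and must not end with "-".
    if "--" in body or body.endswith("-"):
        raise ValueError("meta-information is not valid inside an XML comment")
    return f"{xml_line.rstrip()} <!-- {body} -->"


# Placeholder element, not the real annotation markup.
line = with_meta_comment(
    '<Annotation about="example.com/*"/>',
    opinion="Y*",
    paywall="N",
    notes="Can only filter by author",
)
print(line)
```

Since comments are ignored by XML parsers, this keeps the extra columns visible to human editors without changing what the CSE reads.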


[Image: Screenshot 2024-06-27 at 12 05 22]

... followed by this command to remove duplicates: awk '!x[$0]++'
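
For anyone who prefers not to use awk: `awk '!x[$0]++'` prints each line only the first time it is seen, so duplicates are dropped while the original order is kept. A minimal Python sketch of the same idea follows; the function name `dedupe_lines` is just for illustration.

```python
def dedupe_lines(lines):
    """Drop repeated lines, keeping the first occurrence and original order.

    Same effect as awk '!x[$0]++': a line is emitted only the first time
    it appears. dict keys preserve insertion order in Python 3.7+.
    """
    return list(dict.fromkeys(lines))


sample = [
    "example.com/*",
    "example.org/*",
    "example.com/*",  # duplicate, dropped
]
print(dedupe_lines(sample))
```

Like the awk one-liner, this compares whole lines exactly, so two entries that differ only in whitespace or letter case are still treated as distinct.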