ooni / ooni.org

The ooni.org homepage and all cross organisational issues
https://ooni.org

test-lists: Create script to automatically delete expired and parked domains #1227

Closed agrabeli closed 1 year ago

agrabeli commented 2 years ago

Given that the Citizen Lab test lists (https://github.com/citizenlab/test-lists/tree/master/lists) were originally created by OpenNet Initiative researchers between 2008 and 2012, they include many URLs with expired and parked domains.

It would therefore be great if we could create a script that automatically detects and deletes URLs with expired and parked domains.

This would significantly simplify the test list review process for researchers, and it would also improve OONI measurement quality.
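A minimal sketch of what such a script could look like, offered as an illustration rather than the implementation OONI ended up with. It assumes each test list is a CSV file with a `url` column, treats a failed DNS lookup as a rough proxy for an expired domain (a lookup can also fail temporarily or because of local interference), and flags a page as parked when its body contains one of a few marker phrases. The names `filter_test_list`, `domain_resolves`, `looks_parked`, and `PARKING_MARKERS` are hypothetical, and the marker phrases are placeholders that would need curation.

```python
import csv
import io
import socket
from urllib.parse import urlsplit

# Hypothetical marker phrases; a real script would need a curated list.
PARKING_MARKERS = ("domain is for sale", "buy this domain", "parked free")


def domain_resolves(hostname):
    """Return True if a DNS lookup for the hostname succeeds."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # Also raised for temporary failures, so results should be rechecked.
        return False


def looks_parked(body):
    """Return True if the page body contains a known parking phrase."""
    lowered = body.lower()
    return any(marker in lowered for marker in PARKING_MARKERS)


def filter_test_list(csv_text, resolves=domain_resolves, fetch_body=None):
    """Return (kept_csv, removed_urls) for the contents of one test-list CSV.

    `resolves` and `fetch_body` are injectable so the logic can be exercised
    without network access. Rows whose URL has no hostname are kept untouched.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    removed = []
    for row in reader:
        url = row["url"]
        hostname = urlsplit(url).hostname
        if hostname and not resolves(hostname):
            removed.append(url)  # candidate expired domain
            continue
        if fetch_body is not None and looks_parked(fetch_body(url)):
            removed.append(url)  # candidate parked domain
            continue
        writer.writerow(row)
    return out.getvalue(), removed
```

Because both heuristics can misfire, the list of removed URLs is returned separately so that a researcher can review it before anything is actually deleted from the test lists.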

This activity has been included as an OONI challenge in Roskomsvoboda's DEMHACK hackathon (September 2022): https://demhack.ru/

If this activity is not implemented as part of the hackathon, the OONI team should pick it up.

[Update: 2023-03-15 - we did half of the work; please see https://github.com/ooni/probe/issues/1826, which covers the remaining part of the work originally covered by this issue.]

bassosimone commented 1 year ago

I am going to close this issue as a duplicate because:

Because this issue covers both cases, it is a full duplicate of those two issues.