jasoncheng / dica

This is a very simple Friendica Android client, written in Kotlin
GNU General Public License v3.0

duplicate content filter #75

Closed jasoncheng closed 5 years ago

jasoncheng commented 5 years ago

These two URLs should be filtered as duplicates:

https://www.handelsblatt.com/dpa/sport/skispringen-zu-viel-regen-und-zu-warm-fis-sagtweltcup-in-titisee-ab/23716260.html?ticket=ST-4708340-epTTxnT6NHfsLP6osCsJ-ap5

https://www.handelsblatt.com/dpa/sport/skispringen-zu-viel-regen-und-zu-warm-fis-sagtweltcup-in-titisee-ab/23716260.html

and also the image

[image: 15603]
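A minimal sketch of one way to treat the two URLs above as the same link, assuming the only difference is a query string such as `?ticket=...`. The helper names `stripQuery` and `isDuplicateUrl` are hypothetical and are not taken from the dica source.

```kotlin
// Hypothetical helper: drop the fragment and the query string so that
// session parameters like "?ticket=..." do not make two links look different.
fun stripQuery(url: String): String =
    url.trim().substringBefore('#').substringBefore('?')

// Two URLs count as duplicates when they match after stripping.
fun isDuplicateUrl(a: String, b: String): Boolean =
    stripQuery(a) == stripQuery(b)
```

This is only safe for sites where the query string does not select the content; a site that serves different pages via `?id=...` would need a different rule.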

jasoncheng commented 5 years ago

Let's forget about the folder and just compare with substring(0, 64).
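A rough sketch of that prefix comparison, assuming the compared values are plain strings (URLs or post text). The names `samePrefix` and `dropPrefixDuplicates` are hypothetical. It uses `take(n)` rather than `substring(0, 64)` because `substring` throws when the string is shorter than 64 characters.

```kotlin
// Hypothetical helper: two strings are treated as duplicates when their
// first n characters match. take(n) returns the whole string if it is shorter.
fun samePrefix(a: String, b: String, n: Int = 64): Boolean =
    a.take(n) == b.take(n)

// Keep only the first item for each distinct 64-character prefix.
fun dropPrefixDuplicates(items: List<String>, n: Int = 64): List<String> =
    items.distinctBy { it.take(n) }
```

The trade-off is that two different items sharing a long common prefix are also collapsed, so the cut-off length has to be tuned against real data.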

jasoncheng commented 5 years ago

much better now :)

[image: 15604]

jasoncheng commented 5 years ago

There are still other formats to deal with; let's filter them tomorrow.

jasoncheng commented 5 years ago

Getting better now: it strips the site title + URL.

[image: 15978]