Closed jasoncheng closed 5 years ago
These two URLs should be filtered as duplicates:
https://www.handelsblatt.com/dpa/sport/skispringen-zu-viel-regen-und-zu-warm-fis-sagtweltcup-in-titisee-ab/23716260.html?ticket=ST-4708340-epTTxnT6NHfsLP6osCsJ-ap5
https://www.handelsblatt.com/dpa/sport/skispringen-zu-viel-regen-und-zu-warm-fis-sagtweltcup-in-titisee-ab/23716260.html
The same goes for images.
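A minimal sketch of one way to treat the two URLs above as the same item: drop the query string before comparing. The function names `normalize_url` and `dedupe_urls` are hypothetical, and discarding the whole query (rather than only the `ticket` parameter) is an assumption that may be too aggressive for sites that put the article ID in the query.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Return the URL without its query string and fragment (assumed safe here)."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def dedupe_urls(urls):
    """Keep the first URL seen for each normalized form, preserving order."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique
```

With this, the `?ticket=...` variant and the plain variant collapse to one entry, and whichever appears first in the input is kept.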
Let's forget about the folder and compare using substring(0, 64).
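A rough sketch of that comparison, under two assumptions that the comment does not spell out: "folder" means the directory part of the URL path, and the 64-character prefix is taken from the remaining last path segment. The names `compare_key` and `looks_like_duplicate` are hypothetical.

```python
from urllib.parse import urlsplit

PREFIX_LEN = 64  # mirrors substring(0, 64)


def compare_key(url: str) -> str:
    """Build a comparison key: last path segment only, cut to 64 characters."""
    path = urlsplit(url).path                # query string is not part of the path
    last_segment = path.rsplit("/", 1)[-1]   # ignore the folder part (assumption)
    return last_segment[:PREFIX_LEN]


def looks_like_duplicate(url_a: str, url_b: str) -> bool:
    """Treat two URLs as duplicates when their comparison keys match."""
    return compare_key(url_a) == compare_key(url_b)
```

Comparing only a prefix keeps long file names (for example, image names with size suffixes at the end) from defeating the match, at the cost of possible false positives when two different items share their first 64 characters.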
much better now :)
There are still other formats to deal with; I'll filter them tomorrow.
Getting better now: strip the site title and URL.