DandelionSprout / adfilt

The place where I, DandelionSprout, store my web filter lists for countless topics, including my Nordic adblock list. As simple as that, really.

Update LegitimateURLShortener.txt #277

Closed Nojuuu closed 2 years ago

Nojuuu commented 2 years ago

Removes parameter siteid from these:

https://www.barrons.com/articles/verizon-brady-dividend-stocks-51630700797?siteid=yhoof2

https://www.marketwatch.com/story/dutch-bros-to-offering-211-million-shares-in-ipo-priced-at-18-to-20-each-2021-09-07?siteid=yhoof2
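
As a rough illustration of what removing `siteid` does to the first URL above, here is a minimal Python sketch. It only mimics the effect with the standard library; the helper name `strip_param` is made up for this example and is not part of any adblocker.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_param(url: str, name: str) -> str:
    """Return `url` with every query parameter called `name` removed."""
    parts = urlsplit(url)
    # Keep every (key, value) pair whose key is not the one being stripped.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != name
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


url = "https://www.barrons.com/articles/verizon-brady-dividend-stocks-51630700797?siteid=yhoof2"
print(strip_param(url, "siteid"))
```

Other query parameters, if any, are left in place; only `siteid` is dropped.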

iam-py-test commented 2 years ago

This looks ready to merge. I just updated the data/id.

DandelionSprout commented 2 years ago

This looks perfectly good to me, so I'll merge now.

Nojuuu commented 2 years ago

I made a quick change, sorry.

Nojuuu commented 2 years ago

I noticed that there was already /^siteid/i and that removing mod might be a bad idea.

DandelionSprout commented 2 years ago

Now that I look closer at the "Files changed" tab, weren't there indeed supposed to be =mod removals too?

DandelionSprout commented 2 years ago

Okay, now the info and the changes seem to match up, from what I can see.

Nojuuu commented 2 years ago

Yeah, I removed it because I noticed this older discussion: https://github.com/DandelionSprout/adfilt/discussions/163#discussioncomment-526415

I'll try to make a better filter for those mod params later.

Nojuuu commented 2 years ago

@DandelionSprout @iam-py-test

||wsj.com/*^$removeparam=/^(mod=wsjhp_columnists_pos|mod=wsjheader_logo|mod=hp_trending_now_opn_pos|mod=hp_trending_now_video_pos|mod=nav_left_section|mod=economy_video_strap|mod=nav_top_section|mod=hp_minor_pos|mod=article_flashline|mod=hp_DAY_Theme|mod=nav_top_subsection|mod=automatedsubsection_trending_now_video_pos|mod=automatedsubsection_trending_now_article_pos|mod=error_page|mod=theme_coronavirus-ribbon|mod=wsjfooter|mod=CP_PRT_BRD_FTR|mod=WSJ|mod=CP_PRT_BRD_FTR|mod=mdstrip|mod=wsjheader|mod=wsjvidc_series|mod=hp_lead_pos|mod=breadcrumb|mod=hp_listc_pos|mod=podcasts_tile|mod=hp_jr_pos|mod=markets_lead_pos|mod=Home_MDW_MDC|mod=sports_lead_story|mod=hp_lista_pos|mod=latest_headlines|mod=article_relatedinline|mod=cxrecs_join|mod=series_seocareerguideinvesting|mod=md_home_pan_mkt_news|mod=md_home_mf_movers_quote|mod=world_trending_now_video_pos|mod=video_center|mod=md_home_overview_quote|mod=wsjcorphat|mod=search_trending_now_article_pos|mod=search_trending_now_video_pos|mod=searchresults_pos|mod=searchresults_viewallresults|mod=article_inline|mod=mktw|mod=hp_opin_pos|mod=bigtop-breadcrumb|mod=trending_now_opn_|mod=trending_now_news_)/

Would something like that be better? I might have missed some and I still need to go through other sites too.


Shorter version:

||wsj.com/*^$removeparam=/^(mod=wsjhp_|mod=hp_|mod=nav_|mod=economy_|mod=article_|mod=automatedsubsection_|mod=error_|mod=theme_|mod=wsjfooter|mod=CP_|mod=WSJ|mod=mdstrip|mod=wsjheader|mod=wsjvidc_|mod=breadcrumb|mod=podcasts_|mod=markets_|mod=Home_|mod=sports_|mod=latest_|mod=cxrecs_|mod=series_|mod=md_|mod=world_|mod=video_|mod=wsjcorphat|mod=search_|mod=searchresults_|mod=mktw|mod=bigtop-breadcrumb|trending_)/
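
A small Python sketch of how the shorter pattern behaves may help when reviewing it. It assumes the engine tests the regex against each parameter written as a `name=value` string, and it uses an abbreviated copy of the pattern rather than the full list; `would_remove` is a hypothetical helper.

```python
import re

# Abbreviated copy of the shorter pattern above (not the full alternation).
SHORT = re.compile(r"^(mod=wsjhp_|mod=hp_|mod=nav_|mod=WSJ|trending_)")


def would_remove(pair: str) -> bool:
    """Assumption: the regex is tested against one 'name=value' string."""
    return SHORT.search(pair) is not None


for pair in (
    "mod=hp_lead_pos6",
    "mod=trending_now_opn_pos1",
    "mod=djemalertNEWS",
    "trending_x=1",
):
    print(pair, would_remove(pair))
```

Because `^` anchors the whole group, the last alternative `trending_` only matches a parameter whose name starts with `trending_`. Under the assumption above, `mod=trending_now_opn_pos1` would therefore not be removed by the shorter version, even though the longer version lists `mod=trending_now_opn_`.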

iam-py-test commented 2 years ago

@Nojuuu I will try to look into that, but I can't right now.

Nojuuu commented 2 years ago

Option 1:

! —————— Remove all unnecessary mod parameters from 'Dow Jones & Company' sites —————— !
! NOTE: https://github.com/DandelionSprout/adfilt/discussions/163#discussioncomment-526415
!
! Example URLs: 
! https://www.wsj.com/articles/global-stock-markets-dow-update-09-17-2021-11631864534?mod=hp_lead_pos6
! https://www.marketwatch.com/story/here-are-two-large-tech-stocks-to-avoid-according-to-goldman-sachs-11631550925?mod=mw_more_headlines
! https://www.barrons.com/articles/supply-chain-issues-are-about-to-hit-earnings-what-to-expect-51631888409?mod=hp_LEADSUPP_1
! https://www.fnlondon.com/authors/9011?mod=article_byline
! https://www.dowjones.com/cookie-notice/?mod=DJnotice
! https://www.mansionglobal.com/articles/london-rents-hit-decade-low-227330?mod=top_markets_london_uk
! https://www.penews.com/?mod=qlinks
!               
$removeparam=/^(mod=wsjhp_columnists_pos|mod=wsjheader_logo|mod=hp_trending_now_opn_pos|mod=hp_trending_now_video_pos|mod=nav_left_section|mod=economy_video_strap|mod=nav_top_section|mod=hp_minor_pos|mod=article_flashline|mod=hp_DAY_Theme|mod=nav_top_subsection|mod=automatedsubsection_trending_now_video_pos|mod=automatedsubsection_trending_now_article_pos|mod=error_page|mod=theme_coronavirus-ribbon|mod=wsjfooter|mod=CP_PRT_BRD_FTR|mod=WSJ|mod=mdstrip|mod=wsjheader|mod=wsjvidc_series|mod=hp_lead_pos|mod=breadcrumb|mod=podcasts_tile|mod=hp_jr_pos|mod=markets_lead_pos|mod=Home_MDW_MDC|mod=sports_lead_story|mod=hp_lista_pos|mod=hp_listb_pos|mod=hp_listc_pos|mod=latest_headlines|mod=article_relatedinline|mod=cxrecs_join|mod=series_seocareerguideinvesting|mod=md_home_pan_mkt_news|mod=md_home_mf_movers_quote|mod=world_trending_now_video_pos|mod=video_center|mod=md_home_overview_quote|mod=wsjcorphat|mod=search_trending_now_article_pos|mod=search_trending_now_video_pos|mod=searchresults_pos|mod=searchresults_viewallresults|mod=article_inline|mod=mktw|mod=hp_opin_pos|mod=bigtop-breadcrumb|mod=trending_now_opn_|mod=trending_now_news_|mod=top_nav|mod=markets|mod=investing|mod=options|mod=barrons-on-marketwatch|mod=mw_more_headlines|mod=MW_article_top_stories|mod=home-page|mod=personal-finance|mod=MW_story_quote|mod=mw_quote_advanced|mod=economy-politics|mod=newsviewer_click|mod=mw_quote_recentlyviewed|mod=mw_quote_news|mod=retirement|mod=trending-tickers|mod=mw_latestnews|mod=refsymb_mw|mod=side_nav|mod=mw_footer|mod=bomw_article_logo|mod=moremw_bomw|mod=coronavirus|mod=economic-report|mod=federal-reserve|mod=market-snapshot|mod=mw_quote_tab|mod=consumer-products|mod=mortgages|mod=mutual-funds|mod=stocks|mod=us-market-data|mod=europe-market-data|mod=asia-market-data|mod=currencies-market-data|mod=cryptocurrency-market-data|mod=rates-market-data|mod=futures-market-data|mod=us-markets|mod=europe-middle-east-markets|mod=emerging-markets|mod=canadian-markets|mod=asia-markets|mod=latin-america|mod=market-data-center|mod=taxes-retirement|mod=nr_retwidget|mod=fire-retirement|mod=social-security|mod=cryptocurrencies|mod=fa_center|mod=dj-newswires|mod=bniim|mod=initial-public-offerings|mod=exchange-traded-funds|mod=bonds|mod=currencies|mod=futures|mod=cannabis-watch|mod=washington-watch|mod=inflation|mod=the-fed|mod=rex-nutting|mod=the-moneyist|mod=credit-cards|mod=travel|mod=search|mod=nearby_properties|mod=mg_search|mod=family-finances|mod=taxwatch|mod=careers|mod=real-estate-personal-finance|mod=bniir|mod=estate-planning|mod=help-me-retire|mod=real-estate-retirement|mod=mg_mw_virtual_listing|mod=where-should-i-retire|mod=how-to-invest|mod=top_markets|mod=podcasts_tile|mod=marketwatch-live-events|mod=loans|mod=online-courses|mod=insurance|mod=opinion|mod=bnbh|mod=MW_section_top_stories|mod=mw_ipocalendar|mod=mw_upgrades_downgrades|mod=BOL_LOGO|mod=BOL_TOPNAV|mod=hp_LEAD|mod=hp_LATEST|mod=hp_DAY_Theme|mod=hp_COMMENTARY|mod=hp_INTERESTS_commodities|mod=hp_columnists|mod=hp_ADVISOR_top|mod=faranking_subnav|mod=features|mod=hp_FOOTER|mod=BOL|mod=past_editions|mod=hp_PENTA|mod=article_flashline|mod=errorpage|mod=md_subnav|mod=md_usstk_overview_quote|mod=md_usstk_news|mod=article_commentary_tag|mod=DNH_S|mod=hp_DAY|mod=hpsubnav|mod=md_home_pan_mkt_news|mod=md_curr_hdr_view_all_companies|mod=quote_LOGO|mod=PENTA_NAV|mod=centennial|mod=articletype_trending_now_video_pos|mod=article_byline|mod=topStories|mod=qlinks|mod=footer|mod=menu|mod=subscribe-bottom|mod=fn-footer|mod=home-page|mod=DJnotice|mod=DJ|mod=mg|mod=home_hero|mod=pen|mod=homePage_topStories_specialReport|mod=sponsored_main|mod=djmc_SeoSite_brand)/,domain=wsj.com|store.wsj.com|marketwatch.com|barrons.com|fnlondon.com|dowjones.com|mansionglobal.com|penews.com
!
! —— A list of those that should not be removed:
! mod=djemalertNEWS
!
! ———————————————————————————————————————————————————————————————————————————————————— !

Option 2:

! —————— Remove all unnecessary mod parameters from 'Dow Jones & Company' sites —————— !
! NOTE: https://github.com/DandelionSprout/adfilt/discussions/163#discussioncomment-526415
!
! Example URLs: 
! https://www.wsj.com/articles/global-stock-markets-dow-update-09-17-2021-11631864534?mod=hp_lead_pos6
! https://www.marketwatch.com/story/here-are-two-large-tech-stocks-to-avoid-according-to-goldman-sachs-11631550925?mod=mw_more_headlines
! https://www.barrons.com/articles/supply-chain-issues-are-about-to-hit-earnings-what-to-expect-51631888409?mod=hp_LEADSUPP_1
! https://www.fnlondon.com/authors/9011?mod=article_byline
! https://www.dowjones.com/cookie-notice/?mod=DJnotice
! https://www.mansionglobal.com/articles/london-rents-hit-decade-low-227330?mod=top_markets_london_uk
! https://www.penews.com/?mod=qlinks
!      
$removeparam=mod,domain=wsj.com|store.wsj.com|marketwatch.com|barrons.com|fnlondon.com|dowjones.com|mansionglobal.com|penews.com
!
! —— A list of those that should not be removed:
@@mod=djemalertNEWS^$removeparam=mod
!
! ———————————————————————————————————————————————————————————————————————————————————— !
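
To make the intent of Option 2 concrete, here is a hedged Python sketch of the same idea: strip `mod` on the listed sites unless the value is on a keep-list. It is a simulation only. The names `clean`, `DOMAINS` and `KEEP_VALUES` are invented for the example, and treating `domain=` as "the host of the URL being cleaned" is a simplifying assumption rather than a statement about how any adblocker evaluates that option.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Domains copied from the domain= option of the Option 2 filter.
DOMAINS = (
    "wsj.com", "store.wsj.com", "marketwatch.com", "barrons.com",
    "fnlondon.com", "dowjones.com", "mansionglobal.com", "penews.com",
)
# Values that should survive, per the exception rule in Option 2.
KEEP_VALUES = {"djemalertNEWS"}


def clean(url: str) -> str:
    """Drop `mod` on the listed sites unless its value is on the keep-list."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Simplification: match the URL's own host against the domain list.
    if not any(host == d or host.endswith("." + d) for d in DOMAINS):
        return url
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if any(key == "mod" and value in KEEP_VALUES for key, value in pairs):
        return url  # exception applies: leave the URL untouched
    kept = [(key, value) for key, value in pairs if key != "mod"]
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(clean("https://www.dowjones.com/cookie-notice/?mod=DJnotice"))
print(clean("https://www.wsj.com/articles/example?mod=djemalertNEWS"))
```

The maintenance trade-off is visible in the two constants: Option 1 corresponds to growing a long list of values to remove, while Option 2 only needs `KEEP_VALUES` to grow when a legitimate value turns up.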

iam-py-test commented 2 years ago

Option 1:

* This one shouldn't remove any necessary mod params, but it will be a huge list that is probably harder to maintain, and there are probably still some that I've missed.

Option 2:

* I think this one is easier to maintain, as it removes all of them by default and we would just keep adding the ones that should be whitelisted.

@Nojuuu I don't know...

@@mod=djemalertNEWS^$removeparam=mod

Maybe this should check for ? or & somewhere, just in case there is something like https://wsj.com?mod=somethingtrackery#mod=djemalertNEWS. Probably unlikely, but possible.

I think this is @DandelionSprout's decision, but I kind of like option 2. I don't really have the ability to check right now, but maybe sometime tomorrow or the day after.

Nojuuu commented 2 years ago

Maybe this should check for ? or & somewhere, just in case there is something like https://wsj.com?mod=somethingtrackery#mod=djemalertNEWS. Probably unlikely, but possible.

@@/(?:[?]|[&])mod=djemalertNEWS/$removeparam=mod

Like that?

DandelionSprout commented 2 years ago

I'm inclined to go with option 2 for now. Anyone in favour?

Nojuuu commented 2 years ago

Yeah, seems good to me.

iam-py-test commented 2 years ago

@Nojuuu @DandelionSprout I can add it if Imre is OK with that (or he can add it).

Nojuuu commented 2 years ago

@DandelionSprout You forgot to whitelist mod=djemalertNEWS

DandelionSprout commented 2 years ago

Oh, right. I mistakenly thought that whitelisting was unnecessary when using option 2. Give me 3 minutes.

DandelionSprout commented 2 years ago

As for @@/(?:[?]|[&])mod=djemalertNEWS/$removeparam=mod, it may be possible to use @@^mod=djemalertNEWS^$removeparam=mod, but I'd have to finely calculate how big of an improvement it is over @@mod=djemalertNEWS^$removeparam=mod
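
One way to estimate the difference between the three exception patterns is to run them over a few sample URLs. The Python sketch below is an approximation under stated assumptions: `abp_to_regex` is a hypothetical helper that only translates the `^` separator (any character other than a letter, a digit, or `_ - . %`, or the end of the address), and it assumes the pattern is matched against the full URL string including the fragment, which may not hold in a given engine. The fourth URL with `xmod` is invented for the comparison.

```python
import re


def abp_to_regex(pattern: str) -> re.Pattern:
    """Translate a plain filter pattern; only the '^' separator is handled."""
    separator = r"(?:[^A-Za-z0-9_\-.%]|$)"
    return re.compile(
        "".join(separator if ch == "^" else re.escape(ch) for ch in pattern)
    )


CANDIDATES = {
    "plain": abp_to_regex("mod=djemalertNEWS^"),
    "separator": abp_to_regex("^mod=djemalertNEWS^"),
    "regex": re.compile(r"(?:[?]|[&])mod=djemalertNEWS"),
}

URLS = [
    "https://www.wsj.com/articles/x?mod=djemalertNEWS",
    "https://www.wsj.com/articles/x?a=1&mod=djemalertNEWS",
    "https://wsj.com?mod=somethingtrackery#mod=djemalertNEWS",
    "https://www.wsj.com/articles/x?xmod=djemalertNEWS",
]


def matches(name: str, url: str) -> bool:
    """True if the named candidate pattern matches anywhere in `url`."""
    return CANDIDATES[name].search(url) is not None


for url in URLS:
    print(url, {name: matches(name, url) for name in CANDIDATES})
```

Under these assumptions, the `^mod=` form rejects a look-alike name such as `xmod`, which the plain form accepts, but it still accepts `#mod=djemalertNEWS` because `#` counts as a separator. Only the regex form rejects the fragment case.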

Nojuuu commented 2 years ago

You can use this to check if it is working correctly: https://www.wsj.com/articles/blockchain-com-raises-300-million-as-investors-find-other-ways-into-bitcoin-11616576413?mod=djemalertNEWS

DandelionSprout commented 2 years ago

Seems to work indeed, so I'll modify my initial fix now.

Yuki2718 commented 2 years ago

https://store.wsj.com/shop/emea/wsjsseur21/?trackingCode=aaqwghtc or just https://store.wsj.com/shop/emea/wsjsseur21/ redirects to https://store.wsj.com/shop/apac/jp/jwsjss21/?inttrackingCode=aaqwimzi&icid=JJ_ON_JHP_ACQ_NA in my region:

[Image: wsj]