Closed partingscientist closed 1 week ago
Can it use `json-prune-fetch-response` instead of `trusted-replace-fetch-response` and `replace`? Also, please add links to where these ads are found.
I can reproduce ads on search and this is working on my end:
tokopedia.com##+js(json-prune-fetch-response, 0.data.displayAdsV3, , propsToMatch, url:/graphql/Topads)
Also please add links to where this ads are found.
It will be added later, for the scope of this PR is quite extensive. It will take a while for me to double-check my notes and make sure I don't miss anything, hence the PR being a draft.
Can it use `json-prune-fetch-response` instead?
Regarding the first regex, your suggestion will break some pages (which will be specified later upon the completion of the PR draft) because some properties need to exist for some pages. The point of `replace` and `trusted-replace-fetch-response` is to empty the contents of an array inside the JSON response instead, thus maintaining the existence of those properties.
The second and third regex are used because we're checking the value of a property, so trusted filters are needed, as far as I know.
For the first regex, trusted filter is required because deleting displayAdsV3 using json-prune will break the product page (/p).
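To illustrate the distinction being drawn here, a minimal Python sketch (not uBO code). The response shape is hypothetical (only the `displayAdsV3` name comes from this thread), and `render` is a made-up stand-in for page code that expects the property to exist.

```python
import copy

# Hypothetical response shape; only the displayAdsV3 name comes from the thread.
response = [{"data": {"displayAdsV3": {"data": [{"id": "1"}, {"id": "2"}]}}}]


def render(resp):
    # Made-up stand-in for page code that requires the property to exist.
    return len(resp[0]["data"]["displayAdsV3"]["data"])


# Approach 1: delete the property entirely, which is roughly what pruning
# 0.data.displayAdsV3 amounts to.
pruned = copy.deepcopy(response)
del pruned[0]["data"]["displayAdsV3"]

# Approach 2: keep the property but empty the array inside it.
emptied = copy.deepcopy(response)
emptied[0]["data"]["displayAdsV3"]["data"] = []

try:
    render(pruned)
    pruned_ok = True
except KeyError:
    # The consumer fails because the property it reads is gone.
    pruned_ok = False

emptied_count = render(emptied)
```

Under these assumptions, the consumer fails on the pruned copy but sees an empty list on the emptied copy, which is the behaviour the comment above is asking to preserve.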
Can you screenshot what is broken? I tried the above filter and followed the `Promoted products on product page (/product)` steps but the page looks the same as without the filter for me.
Wait, did I write (/product) for the path instead of (/p)? I'll correct that.
Make sure you have the HTML filter in `My Filters` so that the promoted products are served via a POST request instead of directly in the page script.
If you add `tokopedia.com##+js(json-prune, 0.data.displayAdsV3)` and follow the STR for the product page (/p), you should see this, which indicates no products are found.
The above filter is `json-prune-fetch-response`, so you can prune directly only on the chosen `fetch` response.
`##+js(json-prune, 0.data.displayAdsV3)` does indeed cause breakage but `##+js(json-prune-fetch-response, 0.data.displayAdsV3, , propsToMatch, url:/graphql/Topads)` seems fine for me. Can you check again?
Can you screenshot how it looks on your end? It does not work on my end because that `propsToMatch` should not match anything (A/B testing?). A promoted product on that specific page should have a megaphone symbol on its lower right.
I mean the above filter is for `search` ads but won't cause breakages on `product` pages. It's not meant to filter the ads on `product`, but I think the same concept can be used for `product` pages as well.
For `product` ads, does this work on your side?
tokopedia.com##+js(json-prune-fetch-response, 0.data.displayAdsV3.data.[-].clickTrackUrl, , propsToMatch, /graphql/SearchProductQuery)
I mean the above filter is for `search` ads but won't cause breakages on `product` pages.
Oh yeah, of course. I originally preferred a general approach for all of them, because each page has different props, desktop and mobile each have different props, and I might have missed some pages. Is it preferred to just enumerate each possible request and create a limited filter for each of them separately (combining if possible)? There would obviously be a lower chance of breakage compared to my original approach.
Regardless, I'll convert this to a draft again, for I have a feeling that approach is preferred instead of using the first regex filter.
For `product` ads, does this work on your side?
Yeah that should work.
I think it's better to separate the cases to reduce maintenance complexity and the risk of mismatches or breakage. If combining them is simple enough, then I think it's OK. Also, it's preferable to try non-trusted filters first if we can find a way to use them.
With the new improvement to `json-prune`, I think you can use `[-]` / `{-}` in this case. For search, product and find, do these work on your side?
! search + product
tokopedia.com##+js(json-prune, 0.data.displayAdsV3.data.[-].__typename)
! https://www.tokopedia.com/find
tokopedia.com##+js(json-prune, 0.data.topads.data.[-].productClickURL)
The only one I can't check is the `cart` one. I guess this filter is for it?
||tokopedia.com/graphql/$xhr,replace=/\{"category_id"(?:(?!"ads":\{"id":"").)+?"ads":\{"id":"\d+".+?"__typename":"ProductCarouselV2"\},?//g
At first glance, I think it might be similar to the above cases. Can you check again or share its JSON?
I think this filter is for the `For You` section? I still see the ads with it:
||tokopedia.com/graphql/$xhr,replace=/\{"(?:productS|s)lashedPrice"(?:(?!"isTopads":false).)+?"isTopads":true.+?"__typename":"recommendationItem"\},?//g
Is the response from this URL: `https://gql.tokopedia.com/graphql/RecommendationFeedQuery`? If it is, the regex doesn't match the response on my side: https://file.lekture.top/json/tokopedia-J8B19zLg.json
Can you share which URL you were focusing on, or the JSON on your side?
I think you can use [-] / {-} in this case.
The first regex is meant to empty the contents of an array value of a property, so these should work. I've replaced the first regex with its equivalents using `json-prune-fetch-response`.
The second regex is for product carousels visible on the search page (/search). You can follow the STR for the page and look for the keyword `Beli Lokal` on the page (I refer to it as a carousel because the segment is scrollable sideways on the mobile version of the site). Here's an example JSON request from `/graphql/InspirationCarousel`.
https://pastes.dev/0ToA4lK1L5
The only difference between ad and non-ad products is the fact that all subproperties of `ads` are non-empty strings instead of empty strings. As far as I know, I need regex for this. The current solution that I have is ugly (negative lookbehind), feel free to improve it.
The third regex is for the cart page (/cart). Here's an example JSON request from `/graphql/RecomWidget`.
https://pastes.dev/dYVHMvwDjo
The only difference between ad and non-ad products is the value of a subproperty `isTopads` being `true` instead of `false`. Again, I think I need regex for it as far as I know.
@partingscientist Does `json-prune` work on your side? If you use `json-prune-fetch-response`, it's better to narrow down which exact fetch URL you want to target with `propsToMatch`. The more specific the match and the fewer URLs targeted, the better the performance.
That should work.
I ended up needing negative lookbehind for the third regex to cover mobile carousels on the product page (/p). Here's a JSON test case (`/graphql/ProductRecommendationQuery`) if you want to try to improve it.
https://pastes.dev/fNPR3NGAB7
Yeah, the others need regex. Since these are large regexes, can you specify the exact path for the filters? Instead of `tokopedia.com/graphql/`, using something like `/graphql/InspirationCarousel` would be better.
I think these 3
||tokopedia.com/graphql/productRecommendationWidget$xhr,replace=/\{"id":\d{9,11}(?:(?!"isTopads":false).)+?"isTopads":true.+?"__typename":"recommendationItem"\},?//g
||tokopedia.com/graphql/ProductRecommendationQuery$xhr,replace=/\{"id":\d{9,11}(?:(?!"isTopads":false).)+?"isTopads":true.+?"__typename":"recommendationItem"\},?//g
! Promoted products on cart page (/cart)
||tokopedia.com/graphql/productRecommendation|$xhr,replace=/\{"id":\d{9,11}(?:(?!"isTopads":false).)+?"isTopads":true.+?"__typename":"recommendationItem"\},?//g
can be combined to one
||tokopedia.com/graphql/productRecommendation$xhr,replace=/\{"id":\d{9,11}(?:(?!"isTopads":false).)+?"isTopads":true.+?"__typename":"recommendationItem"\},?//g
?
I originally split them for clarity, but it should be possible to combine them.
For `trusted-replace-fetch-response`, using `/\/graphql/(?:P|p)roductRecommendation/` for `propsToMatch` should work, no? Or is it exact match only?
`trusted-replace` does not need `propsToMatch`, you just need to put the link at the last argument.
You can use `/\/graphql\/productRecommendation/i`
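As a quick sanity check of the case-insensitive pattern, here is a small Python sketch. It uses Python's `re` rather than the JavaScript engine uBO runs on, and the full URLs are assumptions assembled from the host and operation names mentioned in this thread.

```python
import re

# Python equivalent of /\/graphql\/productRecommendation/i
pattern = re.compile(r"/graphql/productRecommendation", re.IGNORECASE)

# Assumed URLs: host and operation names are taken from the thread, but the
# exact request URLs are not confirmed here.
urls = [
    "https://gql.tokopedia.com/graphql/productRecommendationWidget",
    "https://gql.tokopedia.com/graphql/ProductRecommendationQuery",
    "https://gql.tokopedia.com/graphql/RecomWidget",
]

# Keep only the URLs the pattern would target.
matched = [u for u in urls if pattern.search(u)]
```

With the `i` flag, a single pattern covers both the lowercase `productRecommendationWidget` and the capitalised `ProductRecommendationQuery` while leaving unrelated operations alone.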
I think it's good now
Can I merge it?
I need some time, I want to make sure I don't miss any corner cases. I'll let you know.
Well, I did find one.
https://regex101.com/r/eCo6jn/1
! Promoted products on mobile carousels of product page (/p) and cart page (/cart)
||tokopedia.com/graphql/productRecommendation$xhr,replace=/\{"id":\d{9,11}(?:(?!"isTopads":false).)+?"isTopads":true.+?"__typename":"recommendationItem"\},?//g
I can either use
||tokopedia.com/graphql/productRecommendation$xhr,replace=/\{"id":\d{9,11}(?:(?!"isTopads":false).)+?"isTopads":true.+?"__typename":"recommendationItem"\},//g
(`,?` => `,`), which means the last item in a carousel will be missed if it is an ad, or
||tokopedia.com/graphql/productRecommendation$xhr,replace=/,\{"id":\d{9,11}(?:(?!"isTopads":false).)+?"isTopads":true(?:(?!"__typename":"recommendationItem").)+?"__typename":"recommendationItem"\}(?=\])//g
which does not look pretty.
Any preference?
Sidenote: Why would an e-commerce site place a promoted product at the very back instead of the very front? 🤷
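The corner case can be reproduced outside the browser. The following Python sketch applies the three regex variants to a made-up response in which the ad is the last array item. Only the field names come from the filters; the items and the `data` wrapper are hypothetical, and Python's `re` is assumed to behave like the JavaScript engine for these constructs.

```python
import json
import re

# Hypothetical items; only the field names come from the filters in the thread.
organic = '{"id":123456789,"name":"a","isTopads":false,"__typename":"recommendationItem"}'
ad = '{"id":223456789,"name":"b","isTopads":true,"__typename":"recommendationItem"}'
body = '{"data":[' + organic + ',' + ad + ']}'  # the ad is the last array item

# Shared prefix: an item whose isTopads is true.
AD = r'\{"id":\d{9,11}(?:(?!"isTopads":false).)+?"isTopads":true'
OPTIONAL_COMMA = re.compile(AD + r'.+?"__typename":"recommendationItem"\},?')
REQUIRED_COMMA = re.compile(AD + r'.+?"__typename":"recommendationItem"\},')
LAST_ITEM = re.compile(
    ',' + AD
    + r'(?:(?!"__typename":"recommendationItem").)+?'
    + r'"__typename":"recommendationItem"\}(?=\])'
)


def is_valid_json(text):
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


after_optional = OPTIONAL_COMMA.sub('', body)   # leaves a dangling comma
after_required = REQUIRED_COMMA.sub('', body)   # does not match a trailing ad
after_last_item = LAST_ITEM.sub('', body)       # removes the trailing ad cleanly
```

In this sketch the `,?` variant produces invalid JSON when the ad is last, the `,` variant leaves the ad in place, and the lookahead variant removes it while keeping the array well-formed.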
How about this?
||tokopedia.com/graphql/productRecommendation$xhr,replace=/\{"id":\d{9,11}(?:(?!"isTopads":false).)+?"isTopads":true.+?"__typename":"recommendationItem"\}(,?)/{}\$1/g
That breaks the page. Unfortunately, we cannot empty the object; it has to be deleted.
Yeah then I think it's unavoidable.
I don't think I have anything else to add. Fingers crossed that should be everything.
Ok. I'll merge it.
Somehow I forgot about this.
@stephenhawk8054 Can you replace this with the following?
tokopedia.com##+js(json-prune, [].data.displayAdsV3.data.[-].__typename)
tokopedia.com##+js(json-prune, [].data.TopAdsProducts.data.[-].__typename)
tokopedia.com##+js(json-prune, [].data.topads.data.[-].__typename)
In some rare instances, the promoted products may be served using multiple array elements.
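To show why `[]` is preferred over `0` here, a simplified Python sketch of path-based pruning follows. It is not uBO's implementation: it only models numeric indices, `[]` (visit every array element) and `[-]` (drop array items for which the rest of the path resolves), as I understand those tokens, and the response shape is hypothetical.

```python
def exists(node, path):
    """Return True if the dotted path (as a list of tokens) resolves in node."""
    if not path:
        return True
    head, rest = path[0], path[1:]
    if isinstance(node, list):
        if head == "[]":
            return any(exists(item, rest) for item in node)
        return head.isdigit() and int(head) < len(node) and exists(node[int(head)], rest)
    if isinstance(node, dict):
        return head in node and exists(node[head], rest)
    return False


def prune(node, path):
    """Remove, in place, the array items selected by a path containing [-]."""
    if not path:
        return
    head, rest = path[0], path[1:]
    if isinstance(node, list):
        if head == "[-]":
            # Drop every item for which the remaining path resolves.
            node[:] = [item for item in node if not exists(item, rest)]
        elif head == "[]":
            for item in node:
                prune(item, rest)
        elif head.isdigit() and int(head) < len(node):
            prune(node[int(head)], rest)
    elif isinstance(node, dict) and head in node:
        prune(node[head], rest)


def make_response():
    # Hypothetical batched response with ads in two array elements.
    return [
        {"data": {"displayAdsV3": {"data": [{"__typename": "x"}]}}},
        {"data": {"displayAdsV3": {"data": [{"__typename": "x"}]}}},
    ]


first_only = make_response()
prune(first_only, "0.data.displayAdsV3.data.[-].__typename".split("."))

every_element = make_response()
prune(every_element, "[].data.displayAdsV3.data.[-].__typename".split("."))
```

Under these assumptions, the `0.` path empties only the first element's list, while the `[].` path empties the list in every element and keeps the `displayAdsV3` property itself in place.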
URL(s) where the issue occurs
tokopedia.com
Describe the issue
To fully block promoted products and stores on `tokopedia.com`, especially on the mobile site, trusted filters are required. This pull request is an attempt to address those as extensively as possible.
This pull request covers
Screenshot(s)
The following is used to emphasise targeted elements.
Screenshots:
1. Home page (/)
   ![home-desktop](https://github.com/uBlockOrigin/uAssets/assets/115052854/7b686b24-2116-44a0-bc5c-fbcef0894e99)
   ![home-mobile](https://github.com/uBlockOrigin/uAssets/assets/115052854/af4b724b-a2ff-48d6-a9ca-b3b4c54e099b)
2. Search page (/search)
   ![search-desktop](https://github.com/uBlockOrigin/uAssets/assets/115052854/f7911d82-b7db-4d54-a24d-992af16d8ce4)
   ![search-mobile](https://github.com/uBlockOrigin/uAssets/assets/115052854/13aace66-9da7-4de8-bf34-f373896a0e95)
3. Product page (/p)
   ![product-desktop](https://github.com/uBlockOrigin/uAssets/assets/115052854/2843dd05-a415-421c-b9e9-81dd465ecf90)
   ![product-mobile](https://github.com/uBlockOrigin/uAssets/assets/115052854/063e5c85-4ae4-4deb-ac15-21fcb0290f48)
4. Find page (/find)
   ![find-desktop](https://github.com/uBlockOrigin/uAssets/assets/115052854/a1e4e149-679e-4186-a845-db4792ac73f6)
   ![find-mobile](https://github.com/uBlockOrigin/uAssets/assets/115052854/df6fd33d-746a-4658-b66f-404fcf279a93)
5. Cart page (/cart)
   ![cart-desktop](https://github.com/uBlockOrigin/uAssets/assets/115052854/e9333023-da7d-4752-adcb-1e5340785460)
   ![cart-mobile](https://github.com/uBlockOrigin/uAssets/assets/115052854/55e1a8a1-db59-4c4e-8cad-353858362912)

Versions
Settings
Notes
Rough paper:
The first part of the proposed solution consists of three pruning filters used to empty the contents of an array inside a JSON response. The first filter is for the home page (/) and the search page (/search); the second filter is for the product page (/p); the third filter is for the find page (/find).

The second part consists of seven regexes needed to remove response objects by checking the value of one of their properties. In order, the filters handle:
- desktop search page (/search) carousel,
- mobile search page (/search) carousel,
- mobile product page (/p) carousel,
- mobile product page (/p) recommendation,
- desktop cart page (/cart) recommendation,
- mobile cart page (/cart) carousel, and
- mobile cart page (/cart) recommendation.

The third part consists of an HTML filter used to remove promoted products data served directly in the page when some pages are accessed directly. This forces the site to fetch the needed data via a POST request instead, which will be covered by the first three filters.

Steps to reproduce:
For ease of investigation, you might find it beneficial to switch the site language to English using the language switcher in the footer of the site. Unfortunately, not every string is translated, but it should be helpful.

1. Promoted products on home page (/)
   Note: requires being logged in with an account.
   - Open `tokopedia.com`.
   - Scroll down far enough until you can see a product section titled `For You`.
   - There should be promoted products marked with `Ad` on the lower right of the item.
2. Promoted products on search page (/search)
   - Open `tokopedia.com`.
   - On the search bar at the top of the page, search for `ringke`.
   - There should be a promoted store at the top of the page and promoted products marked with `Ad` on the lower right of the item.
3. Promoted products on product page (/p)
   - Open `tokopedia.com`.
   - Find the keyword `Kategori Pilihan` on the page and click on one of the product choices presented below the keyword.
   - There should be a promoted store at the top of the page and promoted products below it, both marked with a megaphone symbol on the lower right of the listings.
4. Promoted products on find page (/find)
   - Open `tokopedia.com`.
   - Find the keyword `Lagi Trending` on the page and click on one of the product choices presented below the keyword.
   - There should be promoted products marked with `Ad` on the lower right of the item.
5. Promoted products on cart page (/cart)
   Note: requires being logged in with an account.
   - Open `tokopedia.com`.
   - Click the cart button in the top right corner of the page.
   - There should be promoted products marked with `Ad` on the lower right of the item.
6. Promoted products cache on the site
   - Open `https://www.tokopedia.com/search?q=ringke` directly.
   - There should be a promoted store at the top of the page and promoted products marked with `Ad` on the lower right of the item.
   - If everything is done correctly, you should be able to find `displayAdsV3` using your browser inspector, which indicates that the promoted stores and products are served directly in the HTML instead of coming from a POST request, as they do in all of the previously mentioned cases.