foo-git / rewe-discounts

Grabs current REWE discounts and saves them in a markdown file || Holt sich aktuelle REWE-Angebote und exportiert sie in eine Markdown-Liste
GNU General Public License v3.0

Breaking API change 03/2024 - 404 Error #19

Open foo-git opened 7 months ago

foo-git commented 7 months ago

The new API introduced in v2.6 (#17, #18) seems to be broken due to a change by REWE; it now yields a 404 error:

Traceback (most recent call last):
  File "[...]/rewe_discounts.py", line 294, in less_elegant_query
    data = scraper.get(url).json()
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "[...]/venv/lib/python3.11/site-packages/requests/models.py", line 975, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Extra data: line 1 column 5 (char 4)
FAIL: Unknown error while fetching discounts from https://www.rewe.de/api/all-stationary-offers/[...], maybe a typo or the server rejected the request.

Right now, I have no solution available and the script will not work. If you find the correct API url, please let me know.
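The traceback above is a JSON decoding failure: the script calls .json() on the response without checking the HTTP status first. A minimal sketch of a guard follows. The helper name decode_offers and the exception class are hypothetical, and the plain-text body "404 page not found" is an assumption (the thread does not show the response body), although a body starting with "404 " would produce exactly the "Extra data" message seen in the traceback.

```python
import json


class OffersUnavailable(Exception):
    """Raised when the endpoint does not return usable JSON."""


def decode_offers(status_code: int, body: str):
    """Check the HTTP status before trying to decode the body as JSON."""
    if status_code != 200:
        raise OffersUnavailable(f"HTTP {status_code}: {body[:80]!r}")
    try:
        return json.loads(body)
    except json.JSONDecodeError as err:
        raise OffersUnavailable(f"Response is not JSON: {err}") from err


# Hypothetical plain-text error body; decoding it directly fails in the
# same way as the traceback in the issue description.
try:
    json.loads("404 page not found")
except json.JSONDecodeError as err:
    decode_error = str(err)
```

The helper takes the status code and body text instead of a response object, so it can be tried without network access or any third-party package.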

knautschka commented 7 months ago

Hi, with the linked discussion thread I found out what still works: you can build the link for the API with the market-id of a local REWE store. So for example https://shop.rewe.de/api/products/?search=&market=1766005 should work and list all products of the specific market as JSON. (Of course you could add a search term to the URL if you'd like.)

Also, a note: I'm not sure if they changed the market-ids in general. When I used your script I got a market-id back, even with the right address of the market etc., so I thought it was right, but it didn't work. So instead I went to the REWE website, set a local market there manually in the UI, and the URL looked like this: https://www.rewe.de/marktseite/dortmund/1766005/rewe-dortmund-husenerstr-50/ So I tried what I thought should be the id (1766005) in the URL and it actually worked. The JSON you get also lists the prices for the products, which was the feature I was looking for.
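As a rough illustration of the two steps described above (reading the market-id out of a store page URL and building the shop URL from it), here is a small sketch. Both function names are hypothetical, and the assumption that the market-id is the only purely numeric path segment is taken from the single example URL in this comment.

```python
import re
from urllib.parse import urlencode, urlparse


def market_id_from_store_url(store_url: str) -> str:
    """Return the first purely numeric path segment of a rewe.de store URL."""
    path = urlparse(store_url).path
    match = re.search(r"/(\d+)/", path)
    if match is None:
        raise ValueError(f"no market id found in {store_url!r}")
    return match.group(1)


def shop_products_url(market_id: str, search: str = "") -> str:
    """Build the shop product-listing URL for one market."""
    query = urlencode({"search": search, "market": market_id})
    return f"https://shop.rewe.de/api/products/?{query}"


store = "https://www.rewe.de/marktseite/dortmund/1766005/rewe-dortmund-husenerstr-50/"
print(shop_products_url(market_id_from_store_url(store)))
```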

Don't know if this will help you or if this is the solution you had in mind, but I wanted to try anyway because your work here helped me! I never would have thought you could use the REWE API just like that. Keep up the good work! :)

foo-git commented 7 months ago

If I interpret the first link of yours correctly, it shows all products from the Rewe Lieferdienst, which is unfortunately not a list of current discounts. My Rewe store for "testing the script" yields a NO_HIT response, so the URL is not globally valid.

Regarding the market-ids, I could not reproduce your finding, as running the script with the PLZ from your store yields the correct market-id. Did you use the correct PLZ?

./rewe_discounts.py --list-markets 44319
ID       Location
1766005: Filips Einzelhandels KG, Husenerstr. 50, 44319 Dortmund
562336: Carsten Engel e.K., Wickeder Hellweg 100 - 104, 44319 Dortmund
320195: Filips Supermarkt GmbH & Co. KG, Asselner Hellweg 94, 44319 Dortmund

So thanks for your much appreciated feedback, but it's not the solution yet :(

knautschka commented 7 months ago

About the market-ids: Sorry, you're right! I actually had a typo and didn't notice... :D

You're right, unfortunately my link doesn't show the current discounts. I'll keep trying to find a way to get them! :)

knautschka commented 7 months ago

I might have found something that could lead to a solution: I looked through the UI manually and noticed that when you click on a discounted product (for example an item on https://www.rewe.de/angebote/dortmund/1766005/rewe-dortmund-husenerstr-50/?week=current), the page requests a JSON for that product, which can be requested with for example https://www.rewe.de/api/offer-details/15962993?wwIdent=1766005. The ending of the URL seems to be made up of the product-id and the market-id.
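To make the URL scheme described above concrete, here is a small sketch that builds such a URL and splits one back into its parts. The function names are hypothetical, and reading the path segment as the product-id and wwIdent as the market-id follows the guess in this comment, not any documentation.

```python
from urllib.parse import parse_qs, urlencode, urlparse

OFFER_DETAILS_BASE = "https://www.rewe.de/api/offer-details"


def offer_details_url(offer_id: str, market_id: str) -> str:
    """Build the per-product offer URL from product-id and market-id."""
    return f"{OFFER_DETAILS_BASE}/{offer_id}?{urlencode({'wwIdent': market_id})}"


def split_offer_details_url(url: str) -> tuple[str, str]:
    """Return (product-id, market-id) from an offer-details URL."""
    parsed = urlparse(url)
    offer_id = parsed.path.rstrip("/").rsplit("/", 1)[-1]
    market_id = parse_qs(parsed.query)["wwIdent"][0]
    return offer_id, market_id
```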

This might be a good start. Now we have to find out how all discounted products can be requested at once.

What I didn't find yet: At the overview of the discounted products I don't find such a request for the data although it kind of has to be there. So I don't know how to make the request yet.

I will try it further but maybe someone else will be faster than me with the API-request I posted.

Edit: Noticed that you mentioned the URL-scheme in the other issue topic. But maybe it's at least good to know that it still works.

modelD-svg commented 7 months ago

I've been using the API you call "less elegant" for some time now and found this issue today, looking into why it stopped working. From what I can tell, all APIs (both browser and mobile) are now using fully cloudflare'd, WAF'd and fingerprinted endpoints. If you're curious, the app uses mobile-clients-api.rewe.de/api/stationary-app-offers/<mid> with seemingly heavy fingerprinting of course.

I've mostly given up on fighting cloudflare for projects that "just need to run", so I've gone back to just getting the raw html and parsing it with soup. This requires a vm and a sketchy ahk script, but I already have those things anyways and that has been working mostly well for similar projects for eons.

foo-git commented 7 months ago

Maybe the VM way is the way to go. Incidentally, for v1.0 I used selenium and soup to get and process the raw html. From v2.0 onwards I switched to the APIs.

I'll check if the selenium approach still works (although it seriously inflates the dependencies). I can't give an estimation on the timeline, as I'm busy with other tasks at the moment.

modelD-svg commented 7 months ago

I suspect you will have issues with cloudflare using selenium as well, but good luck nonetheless.

huskycgn commented 7 months ago

I built something similar a few months ago; unfortunately it also stopped working. :(

BabyIsh88 commented 5 months ago

Addon not Working.

torbenpfohl commented 4 months ago

I took a look at the rewe app which uses these two endpoints:

https://mobile-api.rewe.de/api/v3/market/search?search="zipcode" for getting market ids in the area.
https://mobile-clients-api.rewe.de/api/stationary-app-offers/"market-id" for getting all offers for a market.

You have to use the same headers that the app uses and you have to specify a certificate and private key (both are in the rewe.apk). With that I got a working version again.

But since I'm not sure if you are allowed to distribute the certificate and private key I haven't made a pull request yet. (Maybe somebody has some insight into that issue.) For now I just added a description on how to get the private key and the certificate. (in my fork)
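For orientation, a request with a client certificate (mutual TLS) could be sketched with the standard library as below. This is an assumption-laden sketch, not the code from the fork: the function names are hypothetical, the header values are left to the caller because they are not listed in this comment, and the certificate and key are assumed to have been exported to PEM files first.

```python
import ssl
import urllib.request

OFFERS_URL = "https://mobile-clients-api.rewe.de/api/stationary-app-offers/{market_id}"


def build_offers_request(market_id: str, headers: dict[str, str]) -> urllib.request.Request:
    """Prepare the GET request; the caller supplies the app's headers."""
    return urllib.request.Request(OFFERS_URL.format(market_id=market_id), headers=headers)


def fetch_offers(market_id: str, headers: dict[str, str], cert_file: str, key_file: str) -> bytes:
    """Send the request while presenting a client certificate."""
    context = ssl.create_default_context()
    # cert_file and key_file are PEM files extracted by the user beforehand.
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    request = build_offers_request(market_id, headers)
    with urllib.request.urlopen(request, context=context, timeout=30) as response:
        return response.read()
```

Only build_offers_request can be tried offline; fetch_offers needs the extracted files and network access.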

modelD-svg commented 4 months ago

Very nice! May I ask how you debugged the issue?

ByteSizedMarius commented 4 months ago

I have confirmed that calling the API works using the certs. I started automating the process described by @torbenpfohl (extracting both the certificates and the password) in PowerShell as a learning exercise (currently trying to get a bit better at PS). It's currently quite messy, but if anyone's interested in beta-testing it, please let me know. Otherwise I will probably publish it at some point in June. Personally, I would advise against publishing the certificates.

torbenpfohl commented 4 months ago

@modelD-svg Was a bit messy. Basically I decompiled the apk with jadx and looked around in the code (with jadx-gui), searching for "Request", "GET" and so on and renaming a lot of functions for clarity. At some point I learned about frida and started to hook some basic networking classes, like java.net.Socket and the Conscrypt functions, first to look at the arguments being passed and the return value, and then to look at the stacktrace. The stacktrace narrowed down the classes I had to look at further. And from those classes it became clearer and clearer which class prepared the GET request. At last I hooked (with frida) a function in that class that was being called a lot and logged the class parameters (url and headers).

The certificate and private key I found while looking through the resources of the decompiled apk (decompiled with apktool); there I found the mtls_prod.pfx file, which was password protected. But searching for mtls_prod in the source code gave only a few classes, and in one of them was the password (as an integer array).
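A sketch of what the last step might look like in Python: turning an integer array into a password and exporting key and certificate from a .pfx file. Several things here are assumptions: that the integers are plain character codes, the function names, and the use of the third-party cryptography package (imported lazily so the first helper runs without it). The thread does not say how the array is actually encoded.

```python
def password_from_int_array(values: list[int]) -> bytes:
    """Interpret each integer as a character code (assumption) and join them."""
    return "".join(chr(v) for v in values).encode("utf-8")


def export_key_and_cert(pfx_path: str, password: bytes) -> tuple[bytes, bytes]:
    """Return (private key PEM, certificate PEM) from a PKCS#12 file."""
    # Imported here so the helper above works without the package installed.
    from cryptography.hazmat.primitives import serialization
    from cryptography.hazmat.primitives.serialization import pkcs12

    with open(pfx_path, "rb") as handle:
        key, cert, _extra = pkcs12.load_key_and_certificates(handle.read(), password)
    key_pem = key.private_bytes(
        serialization.Encoding.PEM,
        serialization.PrivateFormat.PKCS8,
        serialization.NoEncryption(),
    )
    cert_pem = cert.public_bytes(serialization.Encoding.PEM)
    return key_pem, cert_pem
```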

But all in all I took a lot of time and poking around in the source code + hooking a lot of functions. (was my first reverse engineering project though)

@ByteSizedMarius Thank you for the assessment and for testing! I mostly use a Linux distro without Powershell. But I look forward to your script. Maybe I can adapt it for bash.

foo-git commented 4 months ago

@torbenpfohl, thanks for your great work. I allowed myself to put a link to your repository in the README.md to direct users to your fork.

As stated there, I'm currently not able to rewrite this program, so in case you want to create a new main repository for further development, go ahead.

ByteSizedMarius commented 4 months ago

I will start adding to my repository over the weekend, however I'll do a first draft in Go. If someone else (maybe torben) wants to maintain a python script, I'll just do Go, otherwise I'll do both at some point

torbenpfohl commented 4 months ago

Added a python script that gets the key and certificate. But I haven't done extensive testing yet.

ByteSizedMarius commented 4 months ago

same :)

foo-git commented 4 months ago

Thanks @ByteSizedMarius, I added a link in the README to your repository as well.

paulschatt commented 4 months ago

Does anyone know why they don't include GTINs/EANs in the mobile api? I have products in my database with their GTIN so that I can compare offers across different supermarkets. Unfortunately the API that provides GTINs doesn't work anymore

ByteSizedMarius commented 4 months ago

Don't know -- probably because they don't need it for the discounts specifically ;)

But there's a workaround: the discount API returns an article-no, for example

Jacobs Tassimo Kapseln Big Pack Morning Kaffee XL, je 163,8-g-Pckg. (1 kg = 24.36) [...] false {3,99 € Aktion} <nil> {<nil>  [] [{Produktdetails [Art.-Nr.: 7181145 Hersteller: JACOBS]}

You can then just query this number to get the ean, like this:

https://mobile-clients-api.rewe.de/api/products?query=7181145&page=1&objectsPerPage=20&sorting=RELEVANCE_DESC

This returns the ean

[...]  
"articleId": "8711000390757",
[...]

hope that helps!
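The workaround above could be sketched roughly as follows: pull the article number out of the product details text and build the lookup URL from it. The function names are hypothetical, and the "Art.-Nr.:" pattern is inferred from the single example output shown in this comment.

```python
import re
from typing import Optional
from urllib.parse import urlencode

PRODUCTS_URL = "https://mobile-clients-api.rewe.de/api/products"


def article_number(details: str) -> Optional[str]:
    """Extract the digits following 'Art.-Nr.:' from the details text."""
    match = re.search(r"Art\.-Nr\.:\s*(\d+)", details)
    return match.group(1) if match else None


def ean_lookup_url(article_no: str) -> str:
    """Build the product search URL that is queried with the article number."""
    params = {
        "query": article_no,
        "page": 1,
        "objectsPerPage": 20,
        "sorting": "RELEVANCE_DESC",
    }
    return f"{PRODUCTS_URL}?{urlencode(params)}"
```

Sending the request still needs the certificate and headers discussed elsewhere in this thread; the sketch only covers the string handling.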

I also probably wouldn't say too loudly that you want to compare across stores, because I suspect that's what they specifically don't want you to do ;) wouldn't want them to lock down even further ^^

paulschatt commented 4 months ago

Thank you! Hahah yes, you are probably right! It's a shame that it is made difficult on purpose for consumers.

Bit-Barron commented 3 months ago

Hey, is it possible to fetch products? I already tried this:

https://mobile-clients-api.rewe.de/api/products?query=7181145&page=1&objectsPerPage=20&sorting=RELEVANCE_DESC

but it isn't working for me; I get an error:

{ "detail": "Failed to fetch categories: Client error '404 Not Found' for url 'https://mobile-clients-api.rewe.de/api/products?query=7181145&page=1&objectsPerPage=20&sorting=RELEVANCE_DESC'\nFor more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/404" }

ByteSizedMarius commented 3 months ago

For the exact URL you posted, I get a 400 because of some missing headers (with the required certificates). You should get a 403 when missing the certs. Honestly, no clue how you could be getting a 404. I'll add these endpoints to my lib over the next couple of days; you can maybe check again then.

torbenpfohl commented 3 months ago

"ruleVersion": "2" gets me from a 400 response to a 404. And I think you need to set some specific market id, zipcode and service type, as well as some extra header(s) which I don't remember right now. I'm back home tomorrow and will look it up.

ByteSizedMarius commented 3 months ago

no need to check as I was just playing around with them :)

these are the special headers required for the request:

"rd-service-types": "PICKUP",
"rd-customer-zip":  "00000",
"rd-postcode":      "00000",
"rd-market-id":     marketID,

the rest is optional. zips can be anything, just not empty (they are only used if service-type is delivery). the other headers are like all the other requests. good call with the ruleVersions; i just never include them at all. probably just internal api versioning.

edit: sorry, you're right. marketid is required for the /products endpoint, just not for /shop-overview (which I was just looking at). marketid is 831002, for example.
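The four special headers listed above could be produced by a small helper like this. The function name and the default values are illustrative; the "00000" placeholder and the rule that zips must not be empty are taken from this comment.

```python
def market_headers(market_id: str, zip_code: str = "00000", service_type: str = "PICKUP") -> dict[str, str]:
    """Return the rd-* headers named in the thread as required for /products."""
    if not zip_code:
        # Per the thread, the zip can be anything as long as it is not empty.
        raise ValueError("zip code must not be empty")
    return {
        "rd-service-types": service_type,
        "rd-customer-zip": zip_code,
        "rd-postcode": zip_code,
        "rd-market-id": market_id,
    }


print(market_headers("831002"))
```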

Bit-Barron commented 3 months ago

Hey, now I just get:

{ "detail": "HTTP error: Client error '400 Bad Request' for url 'https://mobile-clients-api.rewe.de/api/products?categorySlug=katzenfutter&objectsPerPage=60&page=1&query=asdasd'\nFor more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/400" }

ByteSizedMarius commented 3 months ago

you need a marketid for the products endpoint, just corrected my response. sorry

Bit-Barron commented 3 months ago

So what should the request look like?

ByteSizedMarius commented 3 months ago

URL:
https://mobile-clients-api.rewe.de/api/products?categorySlug=grillsaison&objectsPerPage=30&page=1&query=

Headers:
A-B-Test-Groups: productlist-citrusad
Connection: Keep-Alive
Correlation-Id: 03c04a7f-f3b2-45e7-a015-168f672c7341
Host: mobile-clients-api.rewe.de
Rd-Customer-Zip: 67065
Rd-Is-Lsfk: false
Rd-Market-Id: 831002
Rd-Postcode: 67065
Rd-Service-Types: PICKUP
Rdfa: 3d85e18e-d6df-4f53-8e71-4b3d68c1b3ee
User-Agent: REWE-Mobile-Client/3.18.5.33032 Android/14 Phone/Samsung_SM-S911B
X-Rd-Customer-Zip: 
X-Rd-Market-Id: 
X-Rd-Service-Types: UNKNOWN

some of these headers are optional, this is just what I generate currently

Bit-Barron commented 3 months ago

Hey, it's not working for me; I'm getting 400 Bad Request. Is fetching products working for you?

ByteSizedMarius commented 3 months ago

Yes, it's working. What is the response? It usually tells you what's wrong. Also don't reuse my rdfa/correlation-id.
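Generating your own identifiers instead of reusing the ones posted above is a one-liner per header with the uuid module. The function name is hypothetical, and creating both values fresh on every call is an assumption; the thread does not say whether the app keeps the rdfa value stable across requests.

```python
import uuid


def fresh_request_ids() -> dict[str, str]:
    """Return newly generated values for the Rdfa and Correlation-Id headers."""
    return {
        "Rdfa": str(uuid.uuid4()),
        "Correlation-Id": str(uuid.uuid4()),
    }
```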

Bit-Barron commented 3 months ago

Maybe this helps more. This is my code, written in Python:

import uuid

hostname = "mobile-clients-api.rewe.de"
url = "https://mobile-clients-api.rewe.de/api/products?categorySlug=grillsaison&objectsPerPage=30&page=1&query="
rdfa_uuid = str(uuid.uuid4())
correlation_id_uuid = str(uuid.uuid4())
headers = {
    "user-agent": "REWE-Mobile-Client/3.17.1.32270 Android/11 Phone/Google_sdk_gphone_x86_64",
    "rdfa": rdfa_uuid,
    "Correlation-Id": correlation_id_uuid,
    "Host": hostname,
    "Connection": "Keep-Alive",
    "Accept-Encoding": "gzip"
}

and the response I get is:

{ "detail": "HTTP error: Client error '400 Bad Request' for url 'https://mobile-clients-api.rewe.de/api/products?categorySlug=grillsaison&objectsPerPage=30&page=1&query='\nFor more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/400" }

ByteSizedMarius commented 3 months ago

there are headers missing
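A quick way to spot this kind of problem before sending a request is to compare the header dict against the four headers named as required earlier in the thread. This is only a sketch: the function name is hypothetical, and the required list reflects what was reported here, not an official specification.

```python
REQUIRED_HEADERS = ("rd-service-types", "rd-customer-zip", "rd-postcode", "rd-market-id")


def missing_headers(headers: dict[str, str]) -> list[str]:
    """Return required header names that are absent or empty (case-insensitive)."""
    present = {name.lower() for name, value in headers.items() if value}
    return [name for name in REQUIRED_HEADERS if name not in present]


# Header names from the snippet posted in the thread (values shortened).
posted = {
    "user-agent": "...",
    "rdfa": "...",
    "Correlation-Id": "...",
    "Host": "mobile-clients-api.rewe.de",
    "Connection": "Keep-Alive",
    "Accept-Encoding": "gzip",
}
print(missing_headers(posted))
# ['rd-service-types', 'rd-customer-zip', 'rd-postcode', 'rd-market-id']
```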

Bit-Barron commented 3 months ago

Thank you, it's working now!

Bit-Barron commented 2 months ago

Hey, it's me again. My website is finished, but now I have problems fetching the products on the server. With the same API call that works locally, I get:

{ "detail": "HTTP error: Client error '403 Forbidden' for url 'https://mobile-clients-api.rewe.de/api/products?categorySlug=regional&objectsPerPage=25&page=1&query=*&sorting=TOPSELLER_DESC'\nFor more information check: https://httpstatuses.com/403" }

Could this be because my IP is not whitelisted, or because my cert and key are not working properly?

torbenpfohl commented 2 months ago

I don't think there is a whitelist.. Can you share your query-code?

Bit-Barron commented 2 months ago

Hello, that's my Python code:

import os
import json
import uuid
import httpx
from pathlib import Path
from fastapi import APIRouter, HTTPException, Query

from get_creds import get_creds

router = APIRouter()

PRIVATE_KEY_FILENAME = "private.key"
CERTIFICATE_FILENAME = "private.pem"
SOURCE_PATH = Path(__file__).resolve().parent
FULL_KEY_FILE_PATH = os.path.join(SOURCE_PATH, PRIVATE_KEY_FILENAME)
FULL_CERT_FILE_PATH = os.path.join(SOURCE_PATH, CERTIFICATE_FILENAME)


def fetch_products(
    query: str = "*",
    page: int = 1,
    objects_per_page: int = 30,
    sorting: str = "TOPSELLER_DESC",
    categorySlug: str = "regional",
    filter: str = "ALL",  # New parameter for the filter
):
    filters = []
    if filter == "VEGAN":
        filters.append("isVegan:true")
    elif filter == "VEGETARIAN":
        filters.append("isVegetarian:true")
    elif filter == "ORGANIC":
        filters.append("isOrganic:true")
    elif filter == "REGIONAL":
        filters.append("isRegional:true")

    filter_query = " AND ".join(filters) if filters else ""

    # Extract key and certificate first if they are not next to this file yet
    files = os.listdir(SOURCE_PATH)
    if PRIVATE_KEY_FILENAME not in files or CERTIFICATE_FILENAME not in files:
        get_creds(source_path=SOURCE_PATH, key_filename=PRIVATE_KEY_FILENAME, cert_filename=CERTIFICATE_FILENAME)

    client_cert = FULL_CERT_FILE_PATH
    client_key = FULL_KEY_FILE_PATH

    # Updated URL to include dynamic categorySlug
    url = (
        f"https://mobile-clients-api.rewe.de/api/products?"
        f"categorySlug={categorySlug}&"
        f"objectsPerPage={objects_per_page}&"
        f"page={page}&"
        f"query={query}&"
        f"sorting={sorting}"
        f"{'&filter=' + filter_query if filter_query else ''}"
    )
    headers = {
        "A-B-Test-Groups": "productlist-citrusad",
        "Connection": "Keep-Alive",
        "Correlation-Id": "03c04a7f-f3b2-45e7-a015-168f672c7341",
        "Host": "mobile-clients-api.rewe.de",
        "Rd-Customer-Zip": "67065",
        "Rd-Is-Lsfk": "false",
        "Rd-Market-Id": "831002",
        "Rd-Postcode": "67065",
        "Rd-Service-Types": "PICKUP",
        "Rdfa": "3d85e18e-d6df-4f53-8e71-4b3d68c1b3ee",
        "User-Agent": "REWE-Mobile-Client/3.18.5.33032 Android/14 Phone/Samsung_SM-S911B",
        "X-Rd-Customer-Zip": "",
        "X-Rd-Market-Id": "",
        "X-Rd-Service-Types": "UNKNOWN"
    }

    try:
        with httpx.Client(http2=True, cert=(client_cert, client_key), headers=headers) as client:
            res = client.get(url)
            res.raise_for_status()
            return res.json()
    except httpx.RequestError as e:
        raise HTTPException(status_code=500, detail=f"Request error: {e}")
    except httpx.HTTPStatusError as e:
        raise HTTPException(status_code=500, detail=f"HTTP error: {e}")


@router.get("/products")
def get_products(
    query: str = Query("*", description="Search query for products"),
    page: int = Query(1, description="Page number"),
    objects_per_page: int = Query(25, description="Number of products per page"),
    sorting: str = Query("TOPSELLER_DESC", description="Sorting method"),
    categorySlug: str = Query("regional", description="Category slug for filtering products"),
    filter: str = Query("ALL", description="Filter by attributes like vegan, organic, etc."),
):
    return fetch_products(query, page, objects_per_page, sorting, categorySlug, filter)

Bit-Barron commented 2 months ago

> I don't think there is a whitelist.. Can you share your query-code?

Hello, that's my Python code:

import os
import json
import uuid
import httpx
from pathlib import Path
from fastapi import APIRouter, HTTPException, Query

from get_creds import get_creds

router = APIRouter()

PRIVATE_KEY_FILENAME = "private.key"
CERTIFICATE_FILENAME = "private.pem"
SOURCE_PATH = Path(__file__).resolve().parent
FULL_KEY_FILE_PATH = os.path.join(SOURCE_PATH, PRIVATE_KEY_FILENAME)
FULL_CERT_FILE_PATH = os.path.join(SOURCE_PATH, CERTIFICATE_FILENAME)


def fetch_products(
    query: str = "*",
    page: int = 1,
    objects_per_page: int = 30,
    sorting: str = "TOPSELLER_DESC",
    categorySlug: str = "regional",
    filter: str = "ALL",  # New parameter for the filter
):
    # Map the filter name to the attribute expression sent to the API
    filters = []
    if filter == "VEGAN":
        filters.append("isVegan:true")
    elif filter == "VEGETARIAN":
        filters.append("isVegetarian:true")
    elif filter == "ORGANIC":
        filters.append("isOrganic:true")
    elif filter == "REGIONAL":
        filters.append("isRegional:true")

    filter_query = " AND ".join(filters) if filters else ""

    # Create the client certificate and key if they are not present yet
    files = os.listdir(SOURCE_PATH)
    if PRIVATE_KEY_FILENAME not in files or CERTIFICATE_FILENAME not in files:
        get_creds(source_path=SOURCE_PATH, key_filename=PRIVATE_KEY_FILENAME, cert_filename=CERTIFICATE_FILENAME)

    client_cert = FULL_CERT_FILE_PATH
    client_key = FULL_KEY_FILE_PATH

    # Updated URL to include dynamic categorySlug
    url = (
        f"https://mobile-clients-api.rewe.de/api/products?"
        f"categorySlug={categorySlug}&"
        f"objectsPerPage={objects_per_page}&"
        f"page={page}&"
        f"query={query}&"
        f"sorting={sorting}"
        f"{'&filter=' + filter_query if filter_query else ''}"
    )
    headers = {
        "A-B-Test-Groups": "productlist-citrusad",
        "Connection": "Keep-Alive",
        "Correlation-Id": "03c04a7f-f3b2-45e7-a015-168f672c7341",
        "Host": "mobile-clients-api.rewe.de",
        "Rd-Customer-Zip": "67065",
        "Rd-Is-Lsfk": "false",
        "Rd-Market-Id": "831002",
        "Rd-Postcode": "67065",
        "Rd-Service-Types": "PICKUP",
        "Rdfa": "3d85e18e-d6df-4f53-8e71-4b3d68c1b3ee",
        "User-Agent": "REWE-Mobile-Client/3.18.5.33032 Android/14 Phone/Samsung_SM-S911B",
        "X-Rd-Customer-Zip": "",
        "X-Rd-Market-Id": "",
        "X-Rd-Service-Types": "UNKNOWN",
    }

    # Send the request over HTTP/2 with the client certificate
    try:
        with httpx.Client(http2=True, cert=(client_cert, client_key), headers=headers) as client:
            res = client.get(url)
            res.raise_for_status()
            return res.json()
    except httpx.RequestError as e:
        raise HTTPException(status_code=500, detail=f"Request error: {e}")
    except httpx.HTTPStatusError as e:
        raise HTTPException(status_code=500, detail=f"HTTP error: {e}")


@router.get("/products")
def get_products(
    query: str = Query("*", description="Search query for products"),
    page: int = Query(1, description="Page number"),
    objects_per_page: int = Query(25, description="Number of products per page"),
    sorting: str = Query("TOPSELLER_DESC", description="Sorting method"),
    categorySlug: str = Query("regional", description="Category slug for filtering products"),
    filter: str = Query("ALL", description="Filter by attributes like vegan, organic, etc."),
):
    return fetch_products(query, page, objects_per_page, sorting, categorySlug, filter)

Exactly that code works locally, but not on my VPS.
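The query string in the code above is assembled by hand with f-strings. As a minimal sketch, the same URL could be built with `urllib.parse.urlencode`, which would also escape values that contain spaces (for example several filters joined with ` AND `). The helper name `build_products_url` and the `FILTER_EXPRESSIONS` table are made up for illustration; the parameter names and filter expressions are taken from the code above. Whether the server treats escaped and unescaped values identically is an assumption, so `*` and `:` are kept literal here.

```python
from urllib.parse import urlencode

BASE_URL = "https://mobile-clients-api.rewe.de/api/products"

# Filter names and expressions as used in the code above
FILTER_EXPRESSIONS = {
    "VEGAN": "isVegan:true",
    "VEGETARIAN": "isVegetarian:true",
    "ORGANIC": "isOrganic:true",
    "REGIONAL": "isRegional:true",
}


def build_products_url(query="*", page=1, objects_per_page=30,
                       sorting="TOPSELLER_DESC", category_slug="regional",
                       filter_name="ALL"):
    """Hypothetical helper: build the product-list URL from its parameters."""
    params = {
        "categorySlug": category_slug,
        "objectsPerPage": objects_per_page,
        "page": page,
        "query": query,
        "sorting": sorting,
    }
    # Unknown names such as "ALL" add no filter parameter
    expression = FILTER_EXPRESSIONS.get(filter_name)
    if expression:
        params["filter"] = expression
    # Keep "*" and ":" literal so the result matches the hand-built URL
    return f"{BASE_URL}?{urlencode(params, safe='*:')}"
```

Calling `build_products_url(filter_name="VEGAN")` yields the same URL as the f-string version with `&filter=isVegan:true` appended.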

Bit-Barron commented 2 months ago

> I don't think there is a whitelist.. Can you share your query-code?

I also built a curl request that works locally but not on the VPS:

curl -X GET "https://mobile-clients-api.rewe.de/api/products?categorySlug=regional&objectsPerPage=30&page=1&query=*&sorting=TOPSELLER_DESC" \
  -H "A-B-Test-Groups: productlist-citrusad" \
  -H "Connection: Keep-Alive" \
  -H "Correlation-Id: 03c04a7f-f3b2-45e7-a015-168f672c7341" \
  -H "Host: mobile-clients-api.rewe.de" \
  -H "Rd-Customer-Zip: 67065" \
  -H "Rd-Is-Lsfk: false" \
  -H "Rd-Market-Id: 831002" \
  -H "Rd-Postcode: 67065" \
  -H "Rd-Service-Types: PICKUP" \
  -H "Rdfa: 3d85e18e-d6df-4f53-8e71-4b3d68c1b3ee" \
  -H "User-Agent: REWE-Mobile-Client/3.18.5.33032 Android/14 Phone/Samsung_SM-S911B" \
  -H "X-Rd-Customer-Zip: " \
  -H "X-Rd-Market-Id: " \
  -H "X-Rd-Service-Types: UNKNOWN" \
  --cert private.pem --key private.key

torbenpfohl commented 2 months ago

Can you show what the curl request (with --verbose added) on your VPS prints out?

Bit-Barron commented 2 months ago

> --verbose added

Yes, give me a minute.

Bit-Barron commented 2 months ago

> --verbose

curl request:

curl -X GET "https://mobile-clients-api.rewe.de/api/products?categorySlug=regional&objectsPerPage=30&page=1&query=*&sorting=TOPSELLER_DESC" \
  -H "A-B-Test-Groups: productlist-citrusad" \
  -H "Connection: Keep-Alive" \
  -H "Correlation-Id: 03c04a7f-f3b2-45e7-a015-168f672c7341" \
  -H "Host: mobile-clients-api.rewe.de" \
  -H "Rd-Customer-Zip: 67065" \
  -H "Rd-Is-Lsfk: false" \
  -H "Rd-Market-Id: 831002" \
  -H "Rd-Postcode: 67065" \
  -H "Rd-Service-Types: PICKUP" \
  -H "Rdfa: 3d85e18e-d6df-4f53-8e71-4b3d68c1b3ee" \
  -H "User-Agent: REWE-Mobile-Client/3.18.5.33032 Android/14 Phone/Samsung_SM-S911B" \
  -H "X-Rd-Customer-Zip: " \
  -H "X-Rd-Market-Id: " \
  -H "X-Rd-Service-Types: UNKNOWN" \
  --cert private.pem --key private.key --verbose

response:

Note: Unnecessary use of -X or --request, GET is already inferred.

Bit-Barron commented 2 months ago

> Can you show what the curl request (with --verbose added) on your VPS prints out?

Do you know how to fix it?

torbenpfohl commented 2 months ago

Not sure yet. What I found so far points to something nginx-related (among other things) and something IPv4-related. So the first thing you might try is forcing IPv4 (--ipv4). If that doesn't work, try HTTP/1.1 (--http1.1).
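The two curl switches suggested above have rough counterparts in Python's standard library. The following is a minimal sketch, not a confirmed fix: `http.client` only implements HTTP/1.1, and binding the socket to the IPv4 wildcard address `0.0.0.0` makes the connection attempt skip IPv6 addresses. The function name `open_ipv4_http11_connection` is hypothetical; the host name and the certificate file names come from the thread.

```python
import http.client
import os
import ssl

HOST = "mobile-clients-api.rewe.de"


def open_ipv4_http11_connection(host=HOST, cert_file="private.pem",
                                key_file="private.key", timeout=10):
    """Prepare (but do not yet open) an IPv4-only HTTPS connection."""
    context = ssl.create_default_context()
    # Attach the client certificate only when both files are present
    if os.path.exists(cert_file) and os.path.exists(key_file):
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    # Binding to 0.0.0.0 restricts the socket to IPv4;
    # http.client itself only speaks HTTP/1.1
    return http.client.HTTPSConnection(
        host, 443, timeout=timeout, context=context,
        source_address=("0.0.0.0", 0),
    )
```

The connection is only opened on the first `conn.request("GET", path, headers=...)` call, so the headers from the code earlier in the thread would still have to be passed there. If this variant succeeds on the VPS while the HTTP/2 client does not, that would hint at the protocol or address family rather than the certificate.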

Bit-Barron commented 2 months ago

> Not sure, yet. nginx related among other things, ipv4 related This is what I found so far, so first thing you might try is using IPv4 (--ipv4). Maybe if that's not working try using http1.1 (--http1.1).

Okay, I will try that, thank you.

Bit-Barron commented 2 months ago

> Not sure, yet. nginx related among other things, ipv4 related This is what I found so far, so first thing you might try is using IPv4 (--ipv4). Maybe if that's not working try using http1.1 (--http1.1).

But if it's not working, are there other ways to host my Python backend so I can use it?

eikaramba commented 2 months ago

Update: OK, I now see that you are already able to fetch the API, just not on the VPS. In that case, ignore my comment.

Not sure if it helps, but I am able to fetch the products (in my forked heissePreise repo) via:

// reweClient.js

const https = require("https");
const fs = require("fs");
const crypto = require("crypto");
const zlib = require("zlib");

const FULL_CERT_FILE_PATH = "private.pem";
const FULL_KEY_FILE_PATH = "private.key";
const client_cert = fs.readFileSync(FULL_CERT_FILE_PATH);
const client_key = fs.readFileSync(FULL_KEY_FILE_PATH);

class ReweClient {
    constructor(marketId) {
        this.hostname = "mobile-clients-api.rewe.de";
        this.marketId = marketId;
    }

    async fetch(url) {
        // const url = `/api/stationary-app-offers/${marketId}${path}`;
        const options = this._getRequestOptions(url);

        return new Promise((resolve, reject) => {
            const req = https.request(options, (res) => {
                let data = [];

                res.on("data", (chunk) => {
                    data.push(chunk);
                });

                res.on("end", () => {
                    const buffer = Buffer.concat(data);
                    this._processResponse(buffer, res.headers, resolve, reject);
                });
            });

            req.on("error", (error) => {
                reject(error);
            });

            req.end();
        });
    }

    _getRequestOptions(url) {
        return {
            hostname: this.hostname,
            port: 443,
            path: url,
            method: "GET",
            cert: client_cert,
            key: client_key,
            headers: {
                "user-agent": "REWE-Mobile-Client/3.17.1.32270 Android/11 Phone/Google_sdk_gphone_x86_64",
                rdfa: crypto.randomUUID(),
                "Correlation-Id": crypto.randomUUID(),
                "rd-service-types": "PICKUP",
                "x-rd-service-types": "PICKUP",
                "rd-is-lsfk": "false",
                "rd-customer-zip": "48149",
                "rd-postcode": "48149",
                "x-rd-customer-zip": "",
                "rd-market-id": this.marketId,
                "x-rd-market-id": "",
                "a-b-test-groups": "productlist-citrusad",
                Host: this.hostname,
                Connection: "Keep-Alive",
                "Accept-Encoding": "gzip",
            },
        };
    }

    _processResponse(buffer, headers, resolve, reject) {
        if (headers["content-encoding"] === "gzip") {
            zlib.gunzip(buffer, (err, decoded) => {
                if (err) {
                    reject(new Error("Error decompressing data: " + err.message));
                    return;
                }
                this._parseAndResolve(decoded, resolve, reject);
            });
        } else {
            this._parseAndResolve(buffer, resolve, reject);
        }
    }

    _parseAndResolve(data, resolve, reject) {
        try {
            const jsonData = JSON.parse(data.toString());
            resolve(jsonData);
        } catch (e) {
            reject(new Error("Error parsing JSON: " + e.message));
        }
    }
}
module.exports = ReweClient;

and then

const client = new ReweClient(1940221);
const firstPage = await client.fetch(`/api/products?query=*&page=${pageId++}&objectsPerPage=250`);
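
For readers following the Python code earlier in the thread, here is a rough Python sketch of two details in the JavaScript client above: generating a fresh `rdfa` and `Correlation-Id` for every request instead of hard-coding them, and decompressing a gzip-encoded body before parsing the JSON. The function names `build_headers` and `decode_body` are hypothetical; header names and values are copied from the client above, with `Host` and `Connection` left to the HTTP library. Whether fresh UUIDs make any difference to the server is not established in this thread.

```python
import gzip
import json
import uuid


def build_headers(market_id, zip_code="48149"):
    """Hypothetical helper mirroring _getRequestOptions: new UUIDs on every call."""
    return {
        "user-agent": "REWE-Mobile-Client/3.17.1.32270 Android/11 Phone/Google_sdk_gphone_x86_64",
        "rdfa": str(uuid.uuid4()),
        "Correlation-Id": str(uuid.uuid4()),
        "rd-service-types": "PICKUP",
        "x-rd-service-types": "PICKUP",
        "rd-is-lsfk": "false",
        "rd-customer-zip": zip_code,
        "rd-postcode": zip_code,
        "x-rd-customer-zip": "",
        "rd-market-id": str(market_id),
        "x-rd-market-id": "",
        "a-b-test-groups": "productlist-citrusad",
        "Accept-Encoding": "gzip",
    }


def decode_body(raw, content_encoding=None):
    """Hypothetical helper mirroring _processResponse and _parseAndResolve."""
    # Only decompress when the server actually declared gzip encoding
    if content_encoding == "gzip":
        raw = gzip.decompress(raw)
    return json.loads(raw.decode("utf-8"))
```

With a plain HTTP library, `build_headers(1940221)` would be passed as the request headers, and the raw response bytes together with the `Content-Encoding` response header would go to `decode_body`.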

Bit-Barron commented 2 months ago

> not sure if it helps but i am able to fetch the products (in my forked heissePreise repo) via:

Is the script hosted somewhere?

eikaramba commented 2 months ago

Yes, it is hosted in a Docker container on my local server, fetching prices every day from my home IP, but not in a cloud server environment, if that is the question.