boberle / vinted-downloader

Download pages and pictures from Vinted
MIT License
9 stars 3 forks

No longer working #1

Closed Danomophone closed 11 months ago

Danomophone commented 11 months ago

Unfortunately it appears they have altered the code significantly. I've downloaded a couple of html files and it doesn't appear there are any links to high res images anymore.

May be dead project ;(

boberle commented 11 months ago

Vinted changes its html files every few months. It's not the first time...

I try to keep up and fix the script every time. I will have a look and try to fix the script.

boberle commented 11 months ago

Found it! :smile:

They have moved the json data from inside the html file to an external .json file that is downloaded alongside the html.

I will try to fix it today or during the week.

boberle commented 11 months ago

Getting the json file requires some auth cookies. I need to change the code a bit.

In the meantime, here is a bash script that downloads the photos in full resolution (unlike the Python script, it does not save the other information):

url=$1

item_id=$(echo "$url" | grep -oP "(?<=/)\d+(?=-)")

curl \
-H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8" \
-H "Accept-Encoding: gzip, deflate, br" \
-H "Accept-Language: fr-FR,fr;q=0.5" \
-H "Connection: keep-alive" \
-H "Sec-Fetch-Dest: document" \
-H "Sec-Fetch-Mode: navigate" \
-H "Sec-Fetch-Site: cross-site" \
-H "TE: trailers" \
-H "Upgrade-Insecure-Requests: 1" \
-H "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0" \
--cookie-jar "vinted_cookies.txt" \
--output vinted_home.out \
"https://www.vinted.fr"

curl \
-H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8" \
-H "Accept-Encoding: gzip, deflate, br" \
-H "Accept-Language: fr-FR,fr;q=0.5" \
-H "Connection: keep-alive" \
-H "Sec-Fetch-Dest: document" \
-H "Sec-Fetch-Mode: navigate" \
-H "Sec-Fetch-Site: cross-site" \
-H "TE: trailers" \
-H "Upgrade-Insecure-Requests: 1" \
-H "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0" \
--cookie "vinted_cookies.txt" \
--output vinted_item.out \
"https://www.vinted.fr/api/v2/items/$item_id?localize=false"

count=0
for photo_url in $(gzip -d < vinted_item.out | jq -r ".item.photos[] | .full_size_url")
do
   curl \
   -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8" \
   -H "Accept-Encoding: gzip, deflate, br" \
   -H "Accept-Language: fr-FR,fr;q=0.5" \
   -H "Connection: keep-alive" \
   -H "Sec-Fetch-Dest: document" \
   -H "Sec-Fetch-Mode: navigate" \
   -H "Sec-Fetch-Site: cross-site" \
   -H "Upgrade-Insecure-Requests: 1" \
   -H "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0" \
   --output "vinted_photo_$count.jpg" \
   "$photo_url"
   ((count++)) || true
done

Just save it in a file download_full_size_images.sh and call it with:

bash -e -x download_full_size_images.sh URL_OF_THE_ITEM
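
For readers who prefer Python, the three steps of the bash script (collect cookies from the home page, request the item JSON, list the full-size photo URLs) can be sketched roughly as follows. This is a minimal sketch, not the code of the actual Python script: the function names are hypothetical, the example item URL in the usage note is made up, and it assumes the API answers a request that carries only a User-Agent header and the home-page cookies, which has not been verified here.

```python
import json
import re
import urllib.request
from http.cookiejar import CookieJar

BASE_URL = "https://www.vinted.fr"  # same host as in the bash script
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0"
)


def extract_item_id(item_url: str) -> str:
    """Return the first run of digits between a '/' and a '-'.

    Same pattern as the grep -oP call in the bash script.
    """
    match = re.search(r"(?<=/)\d+(?=-)", item_url)
    if match is None:
        raise ValueError(f"no item id found in {item_url!r}")
    return match.group(0)


def item_api_url(item_id: str) -> str:
    """Build the item endpoint used by the bash script."""
    return f"{BASE_URL}/api/v2/items/{item_id}?localize=false"


def photo_urls(payload: dict) -> list[str]:
    """Python equivalent of: jq -r '.item.photos[] | .full_size_url'."""
    return [photo["full_size_url"] for photo in payload["item"]["photos"]]


def fetch_item(item_url: str) -> dict:
    """Visit the home page to collect cookies, then request the item JSON.

    Assumption: cookies set by the home page are enough for the API call.
    """
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    opener.addheaders = [("User-Agent", USER_AGENT)]
    opener.open(BASE_URL).read()  # fills the cookie jar
    with opener.open(item_api_url(extract_item_id(item_url))) as response:
        return json.loads(response.read())
```

Calling `photo_urls(fetch_item("https://www.vinted.fr/items/3244050098-some-title"))` would then give the list of URLs to download, provided the site still behaves as described above.
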

Danomophone commented 11 months ago

Thank you for that, I can't tell you how much I appreciate it :)

Danomophone commented 11 months ago

Oh, perhaps related: do you know if the sold item numbers are also indexed somewhere (relative to the actual username)? I believe that unless the user explicitly deletes an item, it remains a viewable page. It occurs to me that the site likely populates the profiles by requesting all the items for a specific username with specific attributes, probably is_hidden:0 or is_closed:0, so surely one could request all sold items, etc. (This site is way more complicated than my level of web programming.)

Danomophone commented 11 months ago

The items are definitely stored somewhere. I'm not able to find specific URLs for the items in the user profiles; however, the relevant field in the response appears to be "is_visible". For example, if you look up item 3244050098, is_visible is set to 0, whereas a searchable item has it set to 1. The URL for the item is still a valid link and can be returned, although the site has a "view sold item" challenge. From looking at the site structure, I think the item list is populated server side:

https://www.vinted.com/api/v2/users/$userID/items

Requesting the basic URL returns only about 50 items by default, but you can append ?page=1&per_page=96&order=relevance to get a maximum of 96 at a time. In theory, could a query for items that are not visible be formed? Your thoughts?
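
To make the pagination concrete, here is a small Python sketch that builds the paginated URL from the parameters mentioned above. The function name `user_items_url` is hypothetical, and the limit of 96 items per page is only what was observed in this thread, not a documented value.

```python
from urllib.parse import urlencode

# Upper bound observed in this thread; not documented by Vinted.
MAX_PER_PAGE = 96


def user_items_url(
    user_id: int,
    page: int = 1,
    per_page: int = MAX_PER_PAGE,
    order: str = "relevance",
) -> str:
    """Build the item-list URL for one page of a user's items."""
    per_page = min(per_page, MAX_PER_PAGE)  # stay within the observed limit
    query = urlencode({"page": page, "per_page": per_page, "order": order})
    return f"https://www.vinted.com/api/v2/users/{user_id}/items?{query}"
```

Walking through a whole profile would mean calling this with page=1, 2, 3 and so on until a page comes back with no items.
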

boberle commented 11 months ago

I've updated the python code to take into account the changes. So I will close the issue. You can create a new one for the sold items if you want.

Regarding the sold items, they are kept, but they seem to be visible only to the seller. I think the list of items for a given user includes the sold items only if you are viewing your own list of items, not someone else's. So I don't think you can get the list of sold items of a random user, even if you can view a sold item when you have the URL. I've done some tests using 2 Vinted accounts, and was unable to find my sold items when not logged in or logged in with a secondary account.

The is_visible property doesn't refer to a sold item, but to an option the seller has to hide an item ("masquer" in French):

image
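
If you want to check the flag on items you have already fetched, a sketch like the following separates them by is_visible. It assumes each item is a dict carrying an is_visible value of 0 or 1, as reported earlier in this thread; the function name is hypothetical, and whether hidden items appear in someone else's item list at all has not been checked.

```python
def split_by_visibility(items: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split item dicts into (visible, hidden) using the is_visible field.

    Items without the field are treated as hidden (an arbitrary choice).
    """
    visible = [item for item in items if item.get("is_visible") == 1]
    hidden = [item for item in items if item.get("is_visible") != 1]
    return visible, hidden
```

For example, the item 3244050098 mentioned above, which had is_visible set to 0, would end up in the second list.
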