mvabdi / vsco-scraper

Easily allows for scraping a VSCO
MIT License
132 stars 25 forks

Trying to scrape everyone I follow #20

Closed sicr0 closed 2 years ago

sicr0 commented 3 years ago

Hi, I love the script. I was trying to add functionality to scrape all the profiles I follow. Where could I look for documentation?

mvabdi commented 2 years ago

Well, there's documentation on the main page of this repository. Do you mean that you want to scrape a list of profiles, or the profiles a user is following? I am pretty sure that VSCO does not expose who a user is following, so the latter would not work.

sicr0 commented 2 years ago

I would like to add functionality that lets the program log in to my profile, scrape the list of people I am following, and then scrape their content. Since it is my own account, I am able to see who I am following.

mvabdi commented 2 years ago

That is out of scope for this script. It currently uses no user cookies, because it can view all images without them; VSCO has no private images as far as I am aware.

Probably the best workaround for you is to use the feature that lets you put the usernames of the people you'd like to view in a text file.

for example:

bob
jack

would scrape users bob and jack

Use this

vsco-scraper "filename-of-text-file" --multiple
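A minimal sketch of preparing such a text file from a list of usernames, assuming the file holds one username per line. The helper names `write_usernames` and `build_command` and the file name `usernames.txt` are made up for illustration; only the `--multiple` flag comes from the command above.

```python
from pathlib import Path


def write_usernames(path, usernames):
    """Write one username per line, skipping blanks and duplicates."""
    seen = set()
    cleaned = []
    for name in usernames:
        name = name.strip()
        if name and name not in seen:
            seen.add(name)
            cleaned.append(name)
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned


def build_command(path):
    """Return the argument list for the invocation quoted in this thread."""
    return ["vsco-scraper", str(path), "--multiple"]


# Example (hypothetical file name):
#   write_usernames("usernames.txt", ["bob", "jack"])
#   subprocess.run(build_command("usernames.txt"), check=True)
```

Keeping the list in code like this makes it easy to regenerate the file when a username changes.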

lazytownfan commented 2 years ago

By the way, do you need the double quotes " around the text file's name? I ask because they don't appear in the documentation.
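For what it's worth, in a typical POSIX shell the quotes are consumed by the shell before the program sees its arguments, so they should only matter when the file name contains spaces or other special characters. A small sketch that uses Python's `shlex` module to mimic that splitting (the file names here are made up):

```python
import shlex

# Quoted and unquoted forms split into the same argument list.
quoted = shlex.split('vsco-scraper "users.txt" --multiple')
bare = shlex.split('vsco-scraper users.txt --multiple')

# With a space in the name, the quotes keep it as a single argument.
spaced_quoted = shlex.split('vsco-scraper "my users.txt" --multiple')
spaced_bare = shlex.split('vsco-scraper my users.txt --multiple')

print(quoted == bare)        # the quotes made no difference here
print(len(spaced_quoted))    # file name stays one argument
print(len(spaced_bare))      # file name is split into two arguments
```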

sicr0 commented 2 years ago

@mvabdi sorry for bothering you. I have around 200 pages' worth of accounts, and they sometimes change usernames. Do you know of any API that could collect all those usernames for me, since doing it manually would take a massive amount of time?

bamtan commented 2 years ago

I'll leave you with this project, which has implemented scraping via site ID: https://github.com/HuggableSquare/vsco-dl/commit/08e153b21f52e104d93d1314d067eeab2c721f35. I can't confirm whether this tool still works, though.

mvabdi commented 2 years ago

I already have code that gets the site IDs per user, so I could probably make a function that outputs them per username or writes them to a text file.

I think I will add it in the next release.

github-userx commented 2 years ago

You rock, @mvabdi! For still keeping this tool alive and improving it after all this time! I remember the early days...

mvabdi commented 2 years ago

Thanks above ^

It did take a while, but I did add an option flag that lets you cache site IDs. If you add -ch to your original command, it should keep downloading a user's content even after their username changes.

The only issue would be if you want to scrape two different people with the same username, after one of them has changed their username. In that case, you would need to figure out what the username of the new person is.

If it becomes an issue, I will probably add a feature for that as well.
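The idea behind caching site IDs can be sketched roughly as follows. This is not the scraper's actual code: the `SiteIdCache` class, its methods, and the JSON file layout are hypothetical, and how a site ID is obtained from VSCO is left out entirely. The sketch only illustrates why keying on a stable ID survives a rename, and why a reused username becomes ambiguous.

```python
import json
from pathlib import Path


class SiteIdCache:
    """Hypothetical cache mapping a site ID to its last known username."""

    def __init__(self, path):
        self.path = Path(path)
        self.by_id = {}
        if self.path.exists():
            self.by_id = json.loads(self.path.read_text(encoding="utf-8"))

    def remember(self, site_id, username):
        # Keyed by ID, so a rename simply overwrites the stored username.
        self.by_id[str(site_id)] = username

    def ids_for(self, username):
        # More than one ID here means the username was reused and the
        # cache still holds a stale entry for its previous owner.
        return [sid for sid, name in self.by_id.items() if name == username]

    def save(self):
        self.path.write_text(json.dumps(self.by_id, indent=2), encoding="utf-8")
```

If `ids_for` returns more than one ID for a username, the cache alone cannot tell the two accounts apart, which is the edge case described above.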