From the two posts I could not discern whether non-malicious scraping is allowed. Maybe robots.txt
can explain more...
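For anyone who wants to check what a robots.txt actually says, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example-bot` user agent, the `example.com` URLs, and the `is_allowed` helper are all made up for illustration; none of this reflects Pixiv's real file. Keep in mind that robots.txt only expresses crawler guidance, so a site's terms of service can still prohibit what robots.txt does not mention.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
# This is NOT Pixiv's actual robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs on a placeholder domain.
    for url in (
        "https://example.com/artworks/1",
        "https://example.com/private/data",
    ):
        print(url, is_allowed(SAMPLE_ROBOTS_TXT, "example-bot", url))
```

To check a live site instead, `RobotFileParser` also offers `set_url()` and `read()`, which fetch the file over the network.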
Use of Cloudflare is kind of scary: it runs on so many properties on the internet that it could easily fingerprint users.
Pixiv says they already see a high rate of false positives in some areas, which forces them to hold back on some measures, but that machine learning may yield better-targeted results.
I would argue you are more likely to have trouble with low reputation accounts, like throwaways!
> whether non-malicious scraping is allowed
https://policies.pixiv.net/en.html
- Other prohibited acts
  - Collection of information using crawlers and other such programs;
And I would be really surprised if it wasn't prohibited. Crawling is prohibited on most websites, and the remaining ones usually just don't care instead of explicitly allowing it. If a website wants you to interact with it programmatically, it will provide an official API and documentation.
> Use of Cloudflare is kind of scary
Pixiv says that it's already used, so not that scary.
> I would argue you are more likely to have trouble with low reputation accounts, like throwaways!
Having trouble with a throwaway account is better than unexpectedly getting your main account banned.
In any case, as with any announcement, take it with a grain of salt. We don't yet know how well these measures will actually work or what could be done about them. I would say: be cautious and don't use your main account. We'll see what happens next.
Both the web interface and the reverse-engineered mobile API (which is what gallery-dl uses) are using Cloudflare's bot management solutions. Interestingly, I've never seen users report that they get Cloudflare challenges when using gallery-dl, but this seems to happen from time to time for projects that are using headless browsers to access the endpoints (e.g. https://github.com/upbit/pixivpy/issues/259).
https://www.pixiv.net/info.php?id=9541
https://inside.pixiv.blog/2023/05/17/102629
Just wanted to share this, since this could eventually result in accounts being banned when using gallery-dl.
Always use throwaway accounts!