Open DonaldTsang opened 5 years ago
@DonaldTsang
First of all, thank you for suggesting these interesting challenges! I have already looked into them and implemented some code for (1) and (2). However, I found that my program was producing inconsistent results. For example, the regex for certain elements (e.g. the validation token csrf) fails to match at times.
Upon further investigation, I realized that the problems were caused by the new UI changes that have been rolling out recently. For more details, please have a look at the readme of this repository.
Due to the above issue, I have decided to pause development of this project until the new UI is stable and released publicly. In the meantime, I will look into pixiv_scraper. Thank you again for these challenges; I will certainly return to them when the new UI is available.
@Kent-Lee thanks for understanding the issue, and I respect that you paused dA until the UI is "safe". Do you think such a feature would be useful for art discovery? And do you think APIs are useful?
Also, on a side note: Do you have a Discord account? If not, Matrix/IRC handle?
@Kent-Lee please check the DMs in Discord, there are some hints as to how the UI changes can be fixed.
@DonaldTsang sorry for the long wait; I have been busy with other stuff lately. Anyway, the program is now updated for the new UI, so it should work like before. I will look into your suggestions soon (if there are no other major things happening in real life, that is). Thank you.
Don't worry, I hope that you are doing well in school/work. The program is starting to take shape. Also be aware that the Pixiv scraper's results should be similar to dA's for ease of cross-checking.
Even better, you can now scrape people's "watching" list from the new "Eclipse"/dark mode!
https://www.deviantart.com/<username>/about#watching
e.g. https://www.deviantart.com/tonibabelony/about#watching
Open that page and repeatedly click the <button> within a <div> inside a <div> inside <div id="watching">, then collect every <a> inside a <span> inside the <div>, <div> and <div> that come before it, all within the same <div id="watching">.
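The extraction step described above could look roughly like the following. This is only a sketch using Python's standard html.parser. It assumes the page source has already been saved after all the "load more" clicks (the clicking itself needs a browser driver and is not shown), and the class name WatchingParser, the helper extract_watching and the SAMPLE markup are made up for illustration; the real Eclipse markup may differ.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are left out of the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class WatchingParser(HTMLParser):
    """Collect href values of <a> tags nested in a <span> under <div id="watching">."""

    def __init__(self):
        super().__init__()
        self.stack = []      # tags currently open inside the watching div
        self.inside = False  # True while we are within <div id="watching">
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if not self.inside:
            if tag == "div" and attrs.get("id") == "watching":
                self.inside = True
                self.stack = [tag]
            return
        if tag == "a" and "span" in self.stack and attrs.get("href"):
            self.links.append(attrs["href"])
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if not self.inside or tag not in self.stack:
            return  # ignore stray closing tags
        # Pop back to the matching open tag, tolerating unclosed children.
        while self.stack.pop() != tag:
            pass
        if not self.stack:
            self.inside = False  # the watching div itself was closed


def extract_watching(html):
    parser = WatchingParser()
    parser.feed(html)
    return parser.links


# Hypothetical markup, shaped like the structure described in this thread.
SAMPLE = """
<div id="watching">
  <div><div><span><a href="https://www.deviantart.com/alice">alice</a></span></div>
       <div><span><a href="https://www.deviantart.com/bob">bob</a></span></div></div>
  <div><div><button>Load more</button></div></div>
</div>
<div><span><a href="https://www.deviantart.com/outside">outside</a></span></div>
"""

print(extract_watching(SAMPLE))
```

Links outside <div id="watching"> are ignored, so the "outside" profile in the sample is not returned.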
From that we can pull some tricks from "Twitter Following Graphs" (who follows who on Twitter) and rank people based on what they liked (link prediction and community detection).
https://www.deviantart.com/<user>/favourites/
https://www.deviantart.com/<user>/favourites/<collectionID>
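As a rough illustration of the ranking idea, one common baseline for link prediction is the Jaccard similarity between two users' sets (watch lists or favourites). The sketch below is not something the scraper does today; the function names and the toy WATCHING data are invented for the example, and a real run would fill the sets from scraped pages.

```python
from itertools import combinations


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_links(watching):
    """Score every pair of users by the overlap of their sets, highest first."""
    scores = []
    for u, v in combinations(sorted(watching), 2):
        score = jaccard(watching[u], watching[v])
        if score > 0:
            scores.append((u, v, score))
    return sorted(scores, key=lambda item: item[2], reverse=True)


# Hypothetical toy data: user -> set of accounts they watch.
WATCHING = {
    "alice": {"carol", "dave", "erin"},
    "bob": {"carol", "dave"},
    "frank": {"erin"},
    "grace": {"heidi"},
}

for u, v, score in suggest_links(WATCHING):
    print(f"{u} <-> {v}: {score:.2f}")
```

Pairs with a high score share many watched accounts, so each user in the pair is a candidate recommendation for the other. The same function works unchanged if the sets hold favourited deviation IDs instead of usernames.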