Open TheTechRobo opened 2 years ago
Can reproduce that with those names, but as far as I can tell, none of them exist (or their profiles require logging in, perhaps?). Others work correctly. Random example: Angelinazhaoooo (though it crashes with a KeyError on the video extraction very quickly).
The Twitter scraper also has an explicit flag, --user-id. Automatic detection for that obviously breaks when someone has a username composed solely of digits.
Also, to dump on every WARNING or higher, there is a global option: --dump-locals. (Yes, it should probably get a better name.)
> The Twitter scraper also has an explicit flag, --user-id. Automatic detection for that obviously breaks when someone has a username composed solely of digits.
Oh, I guess that's true.
I could have sworn they existed when I loaded it up into a browser, but maybe I'm wrong. Sorry for opening this invalid issue, I guess.
Wait, https://weibo.com/qukean exists, I think.
Maybe, but that's behind a login wall. The mobile site, which is publicly accessible and therefore used by snscrape, says it doesn't exist: https://m.weibo.cn/n/qukean
Oh, I'm using weibo.com. Is that different? I don't have to log in for weibo.com/qukean:
Yeah, it's not really a login, but it's an auth system of sorts with awful JS stuff to get cookies for accessing weibo.com (that I didn't want to reimplement). It is the same service though, so it's interesting that this profile is only accessible on weibo.com but not on m.weibo.cn.
Oh yeah, I noticed that redirect. Sounds very annoying to bypass or mimic.
Yeah, the only way to fix this would be to reimplement that auth flow. Not something I'll tackle anytime soon, I think.
It's only the name resolution which is the problem here, it seems. qukean is user ID 1223717857, and that works fine on the mobile site (and consequently with snscrape). The name resolution on weibo.com is still behind the same auth flow though, so this insight doesn't really change anything, but at least you can manually work around it by observing the user ID in the network monitor when loading the profile page and then using that.
Haven't tested with user IDs.
With verbose output (can't get locals because it didn't crash; you should add an option to dump them anyway):
Also it seems really unintuitive to have to add --name as an option if it's not a user ID; could this be fixed like it was with the Twitter scraper, i.e. seeing if it's an int?
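For illustration, the "is it an int?" heuristic discussed here could look roughly like the sketch below. This is not snscrape's actual code: `classify_identifier` and its `force` parameter are hypothetical names, and `force` stands in for an explicit flag such as --name or --user-id that overrides the guess.

```python
from typing import Optional, Tuple


def classify_identifier(value: str, force: Optional[str] = None) -> Tuple[str, str]:
    """Guess whether `value` is a numeric user ID or a username.

    Hypothetical helper for illustration only. `force` may be "id" or
    "name" and plays the role of an explicit command-line flag.
    """
    if force is not None:
        if force not in ("id", "name"):
            raise ValueError("force must be 'id' or 'name'")
        # An explicit choice always wins over the heuristic.
        return force, value
    # str.isdigit() also accepts some non-ASCII digit characters,
    # so restrict the check to ASCII digits.
    if value.isascii() and value.isdigit():
        return "id", value
    return "name", value


if __name__ == "__main__":
    print(classify_identifier("1223717857"))
    print(classify_identifier("qukean"))
    # A username made up only of digits needs the explicit override.
    print(classify_identifier("12345", force="name"))
```

The heuristic alone cannot tell a numeric user ID from a username that consists only of digits, which is why an explicit override is still needed in that case.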