Open kaushikgandhi opened 10 years ago
User data will be added soon.
sure thanks .
hey karan, how would you like the user data to be added? what did u have in mind? i can try to work on it :)
As long as the conventions in existing code are followed, go for it. :)
can i discuss any ideas/issues related to it over here, in case i have questions about whether my approach is correct?
Absolutely!
awesome...excited :)
where can we get the user data from? the only user data i saw was in https://news.ycombinator.com/leaders
I think we should have a method (like get_user) where clients pass in a username (like karangoeluw) and then we return the user data from https://news.ycombinator.com/user?id=karangoeluw. Thoughts?
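A rough sketch of what that could look like, using only the standard library. Only the name get_user and the profile URL come from the suggestion above; user_url, the fetch parameter and the decision to return raw HTML are assumptions made for illustration, and parsing the page into fields is left out.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Profile page mentioned in the thread.
BASE_URL = "https://news.ycombinator.com/user"


def user_url(username):
    """Build the profile URL for a username (hypothetical helper)."""
    return BASE_URL + "?" + urlencode({"id": username})


def _fetch(url):
    """Plain HTTP GET that returns the response body as text."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def get_user(username, fetch=_fetch):
    """Return the raw profile HTML for `username`.

    `fetch` is injectable (an assumption of this sketch) so the function
    can be exercised without hitting the network.
    """
    return fetch(user_url(username))
```

Passing a custom fetch function keeps the URL building testable offline; a real implementation would still need to parse the returned HTML following the conventions of the existing code.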
oh i see....i didn't know u could get user info like that. i will look into it and submit something by tonight :)
Sounds good. I'll go over the code tomorrow, and refactor stuff out as needed.
hey karan,
i am getting the following message when i try to get the html page using beautiful soup:
We've limited requests for this url.
Do u know why this is so? i have tried it for different users but am still getting the same result
@ueg1990 check out the robots.txt of hacker news (https://news.ycombinator.com/robots.txt). It disallows you to read user urls ... but you can crawl every 30 [seconds], or else your ip can get banned. Better if you can implement it with the hnsearch api (http://api.thriftdb.com/api.hnsearch.com/users). They have done it wonderfully, and you don't need to bother about scraping. I use this successfully with my app.
Well the point of this API is to provide a pythonic interface for HN to native Python apps. But yeah, we need to control the requests we make somehow.
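One possible way to control the request rate is a small throttle that is called before every request. This is only a sketch: the Throttle class and its wait method are hypothetical names, and the 30 second default is taken from the comment above rather than verified against robots.txt.

```python
import time


class Throttle:
    """Ensure at least `min_interval` seconds pass between calls to wait()."""

    def __init__(self, min_interval=30.0, clock=time.monotonic, sleep=time.sleep):
        # 30.0 is an assumed default based on the interval mentioned in this thread.
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first one

    def wait(self):
        """Block until the minimum interval since the previous call has elapsed."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A client would create one Throttle and call wait() right before each HTTP request, so the first request goes out immediately and later ones are spaced out.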
hey kaushik, just out of curiosity, how do u use their api? do u just do: http://api.thriftdb.com/api.hnsearch.com/users?id=karangoeluw
@ueg1990 check the request formats here: https://bitbucket.org/kaushikfrnd/hn-scraping/src/42c1da1a6fa85ed12559206819ef9bade808996b/thriftapi_request%20format?at=master. You can also have a look at my code: https://bitbucket.org/kaushikfrnd/hn-scraping. The idea was to store all posts on hacker news to date.
Update the API to fetch user data: submissions, comments, karma, etc.