Closed feesniped closed 12 years ago
I've put a comment on the file. Take a look.
Turns out all you have to do is remove the "/" from line 56. It should look like this: complete_url = self.url + str(i) + '?ax=1&ts=' + str(time.time())
Finally had a chance to look at it. My script works for me: it properly pulled 3 pages' worth from the popular page. It wouldn't make sense to remove the '/', since the links look like this: http://hypem.com/popular/3
I've never scraped the popular feed, but I will explain my logic. In line 56 you attempt (or so it seems) to define complete_url: complete_url = self.url + "/" + str(i) + '?ax=1&ts=' + str(time.time()).
If you then look at line 49, you clearly append a '/' at the beginning AND the end of the user-defined AREA_TO_SCRAPE. So it at least appears that you create the URL with a '//' instead of a single '/'.
Perhaps Hypem deals with the feed urls for popular and favorite feeds in different ways. Perhaps you could take it for a spin again and instead of the popular feed, set scrape area as your hypem username.
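To make the double-slash concern concrete, here is a minimal sketch of the two choices under discussion. It is not the script's actual code: `BASE`, `build_url`, and `path_of` are hypothetical names, and the way line 49 builds the URL is assumed from the description above. Only the line 56 expression is taken from this thread.

```python
import time

BASE = "http://hypem.com"  # assumed base, taken from the link quoted in this thread


def build_url(area, page, trailing_slash, extra_slash):
    """Rebuild complete_url under the two disputed choices (hypothetical helper)."""
    # Assumed shape of line 49: a '/' before the area, and optionally one after it.
    url = BASE + "/" + area + ("/" if trailing_slash else "")
    # Line 56 as quoted in the thread, with or without the extra "/".
    sep = "/" if extra_slash else ""
    return url + sep + str(page) + "?ax=1&ts=" + str(time.time())


def path_of(url):
    """Drop the query string so the slashes are easy to compare."""
    return url.split("?", 1)[0]


print(path_of(build_url("popular", 3, True, True)))    # trailing slash AND extra "/"
print(path_of(build_url("popular", 3, True, False)))   # trailing slash only
print(path_of(build_url("popular", 3, False, True)))   # extra "/" only
```

Under these assumptions only the first combination produces 'popular//3'. Dropping either slash gives the same single-slash path.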
In reply to jedunnigan's comment of Mar 8, 2012, 6:31 PM: Oh. Just don't set AREA_TO_SCRAPE with a trailing slash. Don't modify the rest of the code.
Yes, although I have to admit my reasoning was to future-proof the script. I think complete_url is better suited for additional functionality when it is more context-dependent.
Let me know what you think.
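One possible way to future-proof this, sketched under assumptions: normalize the slashes when joining, so AREA_TO_SCRAPE works with or without a leading or trailing '/'. `join_url` and `complete_url` below are hypothetical helpers, not functions from the script. Only the '?ax=1&ts=' query string comes from the line quoted in this thread.

```python
import time


def join_url(base, *segments):
    """Join URL parts with exactly one '/' between them (hypothetical helper)."""
    parts = [base.rstrip("/")]
    for seg in segments:
        seg = str(seg).strip("/")
        if seg:  # skip empty segments so no '//' can appear
            parts.append(seg)
    return "/".join(parts)


def complete_url(base, area, page):
    # Query string copied from the line quoted in this thread.
    return join_url(base, area, page) + "?ax=1&ts=" + str(time.time())


# The same path comes out regardless of how the area is written.
for area in ("popular", "/popular/", "popular/"):
    print(complete_url("http://hypem.com", area, 3).split("?")[0])
```

With a helper like this, neither line 49 nor line 56 would need to know whether the other one already added a slash.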