-
This is a big one, almost a mini-project in its own right.
Too many CSVs have been hand-edited to fix errors or mistakes where the scraper was lacking (i.e. the manual header row fixes for season, pl…
-
Some third-party libraries are also excellent, and noticeably more convenient to use than the standard library;
for example, requests for fetching web pages;
or Beautiful Soup for XML/HTML parsing;
many Python beginners don't know these handy libraries exist. Consider briefly introducing them, as a guide.
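A minimal sketch of the two libraries mentioned above, assuming they are installed via pip; the `extract_links` / `fetch_links` helpers are purely illustrative, not part of any existing guide:

```python
import requests                  # third-party HTTP client
from bs4 import BeautifulSoup    # third-party HTML/XML parser

def extract_links(html):
    """Return every href value found in an HTML document."""
    soup = BeautifulSoup(html, "html.parser")
    return [a.get("href") for a in soup.find_all("a") if a.get("href")]

def fetch_links(url):
    """Download a page with requests, then pull out its links."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    return extract_links(response.text)
```

Compare this with doing the same via `urllib.request` and `html.parser` callbacks to see why beginners appreciate these libraries.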
me115 updated
10 years ago
-
Use an HTML-compliant parser: in some doctypes (including HTML5) `meta` tags do not need to be closed (e.g. `<meta charset="utf-8">`), but the XML parser fails to read these tags. Also, some HTML entities are not recognize…
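A small sketch of the difference, contrasting the standard-library XML parser with Beautiful Soup's `html.parser` backend; the sample markup is invented for illustration:

```python
import xml.etree.ElementTree as ET
from bs4 import BeautifulSoup

# HTML5 permits an unclosed <meta> tag and named entities like &eacute;.
html = '<html><head><meta charset="utf-8"><title>Caf&eacute;</title></head></html>'

# A strict XML parser rejects this document outright.
try:
    ET.fromstring(html)
except ET.ParseError as exc:
    print("XML parser failed:", exc)

# An HTML-compliant parser handles both quirks.
soup = BeautifulSoup(html, "html.parser")
print(soup.title.string)  # entity decoded to "Café"
```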
mandx updated
12 years ago
-
Hi *,
I was looking for a THREDDS client to browse catalogs and fetch resources (HTTP, OPeNDAP, ...). I didn't find anything that worked, so I started to write my own based on already existing proj…
-
Hello everyone,
The most important thing first: suche-postleitzahl.org is a great service!
For a personal project I'm looking for a list that captures the relationship between (all) German cities, their Stad…
-
RTV currently performs OAuth by opening an external web browser and having the user log in to accept the OAuth request. This can lead to problems if the user doesn't have an external web browser insta…
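One possible fallback, sketched with the standard-library `webbrowser` module; the helper name and the copy-the-URL fallback are assumptions for illustration, not RTV's actual code:

```python
import webbrowser

def open_auth_url(url, opener=None):
    """Open the OAuth URL in a browser if one is available; otherwise
    fall back to printing it so the user can copy it manually.
    (Hypothetical helper, not RTV's implementation.)"""
    try:
        # webbrowser.get() raises webbrowser.Error when no browser is found
        opener = opener or webbrowser.get()
        opener.open(url)
        return True
    except webbrowser.Error:
        print("No web browser found. Please open this URL manually:")
        print(url)
        return False
```

A text-mode browser on `$PATH` (lynx, w3m, etc.) also satisfies `webbrowser.get()`, so the manual fallback only triggers when nothing at all is available.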
-
This is more of a long-term change to the product, one that I might eventually contribute myself.
Right now you can run into problems where the transformation of one URL can also chang…
-
Write scraping blog in portfolio
-
Hi, thanks for the library, it is really helpful. In race/race.py there is
`self._get_endpoint(endpoint=RaceEditionResults, l=classification_num, e=zero_padded_stage_num)`
And in constants.py I se…
-
Not getting any links as output