VertNet / gulo

Shredding Darwin Core Archives with ferocity, strength, and Cascalog.

IPT upgrade breaks page parsing for count field #103

Closed robinkraft closed 11 years ago

robinkraft commented 11 years ago

Per @laurarussell's comment on the recent upgrade of IPT software, get-count must be updated to properly parse the HTML of the new resource page to extract the record count.
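For illustration only, here is a minimal Python sketch of a count extractor that tolerates some markup changes by stripping tags first and then trying several text patterns. It is not the existing get-count code, and the new IPT page's markup is not shown in this thread, so the patterns and the function name `extract_record_count` are assumptions.

```python
import re
from typing import Optional

# Hypothetical patterns: the real IPT resource page markup is not shown
# in this thread, so these are guesses at how a count might be worded.
COUNT_PATTERNS = [
    re.compile(r"(\d[\d,]*)\s+records", re.IGNORECASE),
    re.compile(r"records\s*:\s*(\d[\d,]*)", re.IGNORECASE),
]


def extract_record_count(html: str) -> Optional[int]:
    """Return the first record count found in the page text, or None."""
    # Replace tags with spaces so a change in element nesting matters less.
    text = re.sub(r"<[^>]+>", " ", html)
    for pattern in COUNT_PATTERNS:
        match = pattern.search(text)
        if match:
            # Drop thousands separators before converting.
            return int(match.group(1).replace(",", ""))
    return None
```

Returning `None` rather than raising lets a caller distinguish "page layout changed" from "resource has zero records" and log the former.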

eightysteele commented 11 years ago

+1... ooooor maybe we store counts in cdb to avoid a version dep on how we parse counts. @laurarussell, maybe?

On Mon, Sep 9, 2013 at 12:56 PM, Robin Kraft notifications@github.com wrote:

Per @laurarussell's (https://github.com/laurarussell) comment (https://github.com/VertNet/webapp/issues/353#issuecomment-24106989) on the recent upgrade of IPT software, get-count (https://github.com/VertNet/gulo/blob/8894a3dc94d5bedfe5688998e6f69c93e9ef09ff/src/clj/gulo/harvest.clj#L132-151) must be updated to properly parse the HTML of the new resource page to extract the record count.


robinkraft commented 11 years ago

+1, we'd only have to change the count when re-harvesting a resource.

On Sep 9, 2013, at 3:39 PM, Aaron Steele notifications@github.com wrote:

+1... ooooor maybe we store counts in cdb to avoid a version dep on how we parse counts. @laurarussell, maybe?
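The stored-count idea in this exchange could look roughly like the Python sketch below: the count is written once per harvest and read back later, so nothing depends on the IPT page's HTML. SQLite stands in for "cdb" here purely as an assumption, and the table name, column names, and functions (`open_store`, `record_harvest`, `lookup_count`) are all hypothetical.

```python
import sqlite3
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store and create the (hypothetical) counts table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS resource_counts ("
        "resource_url TEXT PRIMARY KEY, "
        "record_count INTEGER NOT NULL)"
    )
    return conn


def record_harvest(conn: sqlite3.Connection, resource_url: str,
                   record_count: int) -> None:
    """Store the count for a resource, overwriting it on re-harvest."""
    conn.execute(
        "INSERT OR REPLACE INTO resource_counts (resource_url, record_count) "
        "VALUES (?, ?)",
        (resource_url, record_count),
    )
    conn.commit()


def lookup_count(conn: sqlite3.Connection, resource_url: str) -> Optional[int]:
    """Return the stored count, or None if the resource was never harvested."""
    row = conn.execute(
        "SELECT record_count FROM resource_counts WHERE resource_url = ?",
        (resource_url,),
    ).fetchone()
    return row[0] if row else None
```

Because the resource URL is the primary key, a re-harvest replaces the old row, which matches the point that the number only changes when a resource is harvested again.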
