City-Bureau / city-scrapers-fresno

City Scrapers for Fresno
MIT License

#0018 spider needs help #96

Closed haileyhoyat closed 2 years ago

haileyhoyat commented 2 years ago

@jpt-c

This one looks easy, but the table is organized by column rather than by row. Each div element represents one column of the table, so there's no easy way to connect the items that belong to a single meeting row (e.g. a meeting date and its agenda link, which sit in separate columns).

I was thinking of counting how many rows are in the first column (which gives the total number of meeting items), and then in parse() specifying the index of the row to scrape from. Ideas?
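
A minimal sketch of that index-based idea, using plain Python lists as stand-ins for the text pulled from each column div. The column contents and the names `dates`, `agendas`, and `rows_by_index` are made up for illustration; in the spider the lists would come from selector queries against the real page.

```python
# Hypothetical stand-ins for the values scraped from each column div.
dates = ["Jan 5, 2022", "Feb 2, 2022", "Mar 2, 2022"]
agendas = ["/agenda-jan.pdf", "/agenda-feb.pdf", "/agenda-mar.pdf"]


def rows_by_index(dates, agendas):
    """Pair each date with the agenda link at the same position."""
    rows = []
    # The length of the first column gives the number of meeting rows.
    for i in range(len(dates)):
        # Guard against an agenda column that is shorter than the date column.
        agenda = agendas[i] if i < len(agendas) else None
        rows.append({"date": dates[i], "agenda": agenda})
    return rows


meetings = rows_by_index(dates, agendas)
```

This only lines up correctly if every column lists its cells in the same order and no column skips a cell in the middle.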

If you know exactly how to do this one and are like, "omg, I can finish this super fast," please feel free to take this scraper and go for it; the priority at this point is to knock scrapers off asap.


Summary

Issue: #0018

ghost commented 2 years ago

Just took a look over this.

Your approach looks sound to me.

Another approach would be to use zip(), which lets you regroup iterables element by element. Given three iterables (which could be the three columns), you'd use it like this:

for x, y, z in zip("123", "ABC", "WWW"):
    print(x, y, z)

1 A W
2 B W
3 C W

Then just replace "123", "ABC", and "WWW" with the lists you get from querying the three columns.
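
One thing to watch with zip() is columns of unequal length. The sketch below uses made-up lists (`dates`, `titles`, `agendas` are hypothetical, not taken from the actual page) to show that zip() silently stops at the shortest column, while itertools.zip_longest() pads the short ones with None.

```python
from itertools import zip_longest

# Hypothetical stand-ins for the lists returned by the three column queries.
dates = ["Jan 5, 2022", "Feb 2, 2022", "Mar 2, 2022"]
titles = ["Regular Meeting", "Regular Meeting", "Special Meeting"]
agendas = ["/agenda-jan.pdf", "/agenda-feb.pdf"]  # last meeting has no agenda

# zip() stops at the shortest column, so the third meeting is dropped.
zipped = list(zip(dates, titles, agendas))

# zip_longest() pads missing values with None, so every meeting is kept.
padded = list(zip_longest(dates, titles, agendas))

for date, title, agenda in padded:
    print(date, title, agenda)
```

Neither variant can tell when a cell is missing from the middle of a column, so it may be worth comparing the column lengths before pairing them.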

Agreed that the priority is to tackle more scrapers; if this one gets too complex, I'd move on.