openhatch / oh-bugimporters

Bug importers for the OpenHatch project oh-mainline
https://oh-bugimporters.readthedocs.org/
GNU Affero General Public License v3.0

Use scrapy.items.Item to clarify data export; preserve project name; ditch bite_size_bug_name #3

Closed. paulproteus closed this 12 years ago

paulproteus commented 12 years ago

These commits clarify the data export interface -- see bugimporters/items.py for a spec of what data gets passed out through the data transport.

They also add code that calculates a project name within the bug importer, rather than doing it within the data transit as before. This code mostly passes the tests when tested with oh-mainline, although to make it really pass you need a branch I'll be pushing momentarily. (Look for it on github.com/openhatch/oh-mainline.)

Note that calculating the project name within the bug importer is essential to fixing a problem where http://openhatch.org/search/ lists many tasks as being within "GNOME Bugzilla". (Those are supposed to use a custom bug parser, but that custom bug parser's project name was being ignored.)
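To make the project-name point concrete, here is a minimal sketch of the idea. Every class and function name in it is made up for illustration and is not taken from the bugimporters code; it only assumes that a custom parser can override how the project name is derived, and that the importer (not the data transit) asks the parser for it.

```python
class DefaultBugParser:
    """Hypothetical default parser: every bug gets the tracker's name."""

    def __init__(self, tracker_name):
        self.tracker_name = tracker_name

    def project_name_for(self, bug_data):
        return self.tracker_name


class PerProductBugParser(DefaultBugParser):
    """Hypothetical custom parser: one project per product on the tracker."""

    def project_name_for(self, bug_data):
        # Fall back to the tracker name when the bug has no product field.
        return bug_data.get("product") or self.tracker_name


def export_bug(parser, bug_data):
    """Build the exported record; the project name is decided here,
    inside the importer, so a custom parser's answer is not lost later."""
    return {
        "title": bug_data["title"],
        "project": parser.project_name_for(bug_data),
    }
```

With the default parser, a bug filed against the "gedit" product would still be exported under "GNOME Bugzilla"; with the custom parser it would be exported under "gedit".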

So, the questions really for this review are:

If so, please give me an ACK (:

shawnl commented 12 years ago

Why do we have this same or similar dict over and over in the code?

Why can't the get_parsed_data_dict() function be put in a utility file, and made to work for all the scrapers?

paulproteus commented 12 years ago

I don't agree with the claims of "out of order", but I'm open to hearing how it is true.

get_parsed_data_dict() is functionality that varies between the different bug importers; it's their common API. Hope that helps.
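Roughly, the shape is the following. This is a sketch only: the class names, the input field names, and the three output keys are invented for illustration and are not the real spec.

```python
from abc import ABC, abstractmethod

# Hypothetical subset of the keys every importer is expected to emit.
COMMON_KEYS = {"title", "status", "canonical_bug_link"}


class ParsedBug(ABC):
    """Common API: each importer implements this method in its own way."""

    @abstractmethod
    def get_parsed_data_dict(self):
        """Return a dict whose keys are COMMON_KEYS."""


class TracLikeBug(ParsedBug):
    def __init__(self, row):
        self.row = row

    def get_parsed_data_dict(self):
        # This tracker calls the title "summary".
        return {
            "title": self.row["summary"],
            "status": self.row["status"],
            "canonical_bug_link": self.row["url"],
        }


class GitHubLikeBug(ParsedBug):
    def __init__(self, issue):
        self.issue = issue

    def get_parsed_data_dict(self):
        # This tracker calls the status "state" and the link "html_url".
        return {
            "title": self.issue["title"],
            "status": self.issue["state"],
            "canonical_bug_link": self.issue["html_url"],
        }
```

The dicts look alike because the output keys are shared, but the mapping from each tracker's fields to those keys differs per importer, which is why the method bodies are not interchangeable.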

I think the move toward more Scrapy-ness will dramatically improve the cleanliness of this; this is one micro step toward that.

shawnl commented 12 years ago

Fair enough, just thought it should be mentioned.

Otherwise, looks good.