openhatch / oh-bugimporters

Bug importers for the OpenHatch project oh-mainline
https://oh-bugimporters.readthedocs.org/
GNU Affero General Public License v3.0
12 stars 28 forks source link

Create custom bug parser for GHC Trac #101

Closed ghost closed 10 years ago

ghost commented 10 years ago

Comment by paulproteus:

On bugs in the Trac bug tracker for the GHC project, the Bug.good_for_newcomers (AKA bitesize) field should be set to True if the Trac data has a key called

'difficulty'

whose value is

'Easy (less than 1 hour)'

Maybe we should do this with a custom bug parser. We could also make the web UI (even) more complex.


Comment by paulproteus:

This would be a pretty simple new BugParser.


Comment by paulproteus:

Now that we've extracted oh-bugimporters, this would be great for e.g. the upcoming sprint. Not going to get done for 0.12.02, though.


Comment by soonick5:

I have a few questions before I can start making some progress on this ticket:

  • Should I be looking at oh-mainline or oh-bugimporters?
  • Does the bug indexing happens on a cron job or is it manually triggered?
  • It would be helpful to know what is the current result and what is the expected result so I know when I can mark this as completed.

Thanks, Adrian


Comment by paulproteus:

On Sat, 16 Jun 2012, soonick5 wrote:

> > > I have a few questions before I can start making some progress on this ticket: > > - Should I be looking at oh-mainline or oh-bugimporters?

oh-bugimporters

> - Does the bug indexing happens on a cron job or is it manually triggered?

cron job

> - It would be helpful to know what is the current result and what is the > expected result so I know when I can mark this as completed.

The currnt result is that when you parse a Trac bug, the 'difficulty' key from the data table is ignored.

The expected result is that the 'difficulty' field can be used to infer the value of "good_for_newcomers".

An easy implementation is to create a TracBugParser subclass that overrides the extract_tracker_specific_data() method (I think that's what it's called; blanking on name of method at the moment). oh-bugimporters/bugimporters/bugzilla.py has sample BugParser subclasses. Unlike the Bugzilla ones, you'll subclass TracBugParser and put it in trac.py.

-- Asheesh.


Comment by soonick5:

Thanks a lot for your answers.

I started taking a look at the bug importers to make the suggested changes but now I realized that I don't know how to run a bug importer so I can test what it is doing.

Do you know how I can do a run of the bug importer or point me to documentation about how to test this?

Thanks, Adrian


Comment by paulproteus:

[18:40:59] <soonick5> Hello, does anyone know how can I execute a bug importer from oh-bugimporters? [18:42:12] <paulproteus> soonick5: Hi! [18:42:15] <paulproteus> Sorry I didn't reply on the ticket; busy day. [18:42:24] <paulproteus> Quickly, I think the best way is just to write a test in the test suite. [18:42:35] <paulproteus> There isn't a good stand-alone way to run individual bug importers yet. [18:42:49] <paulproteus> I want to migrate the bug importers to be more based on "Scrapy" so that's easier, but for now it's kind of hard. [18:42:57] <soonick5> Don't worry, I imagine you have your eyes in a lot of stuff [18:43:05] <paulproteus> So my recommendation as a coder is to write tests, and then run the test suite, and watch the test pass/fail. [18:43:13] <paulproteus> Note also that you can run the code the way the backend does. [18:43:19] <paulproteus> To do that, get a copy of oh-mainline, and "cd" into it. [18:43:25] <paulproteus> Make sure that it can find oh-bugimporters, and then: [18:43:35] <paulproteus> python manage.py customs_daily_tasks [18:43:40] <paulproteus> That, iirc, runs the bug importer. [18:44:24] <paulproteus> But because oh-mainline and oh-bugimporter are somewhat tightly-coupled, it's kind of a drag to test individual bug parsers/importers this way. [18:44:42] <soonick5> How do I know if oh-mainline can find oh-bugimporters? [18:45:25] <paulproteus> python manage.py shell [18:45:27] <paulproteus> import bugimporters [18:45:32] <paulproteus> If that works, then you're in business. [18:45:50] <paulproteus> There's some autodetect magic where oh-mainline looks at ../oh-bugimporters and if it finds it there, it adds it to the path. [18:48:12] <soonick5> ok, that seemed to work [18:48:20] <soonick5> so once I run: python manage.py customs_daily_tasks [18:48:43] <soonick5> where can I check that the bugs were actually imported? will that reflect in the database? [18:48:49] <paulproteus> Yeah, it'll be reflected in the database. [18:49:16] <paulproteus> The best way to see how it's reflected in the database is by: python manage.py shell_plus [18:49:32] <paulproteus> Play with the Bug class there. [18:50:56] <soonick5> Awesome. I think with all these information I can finally start playing with my task [18:50:58] <soonick5> thanks [18:52:56] <paulproteus> My pleasure! Thanks for pinging here.


Comment by soonick5:

Hello,

Thanks again for all your useful answers.

I followed all the instructions from our chat but when it came to play with the Bugs class I didn't know what to do.

I also tried looking at the data being updated in the database but I didn't know which table I needed to look at. In this same subject, a lot of the documentation talks about a MySQL database but the only database I see in my local is a SQLite database. Is this configured to be this way based on the environment or do I need to have a MySQL database that I currently don't have?

I did a very shallow check of what this does: python manage.py customs_daily_tasks and I didn't find where the actual importing of the code is made. Do you think you can point me to a file I can follow and move from it?

I am still pretty new to Python and Django so I may be asking some very dump questions. If I am taking too long to do something that should be really easy I would be happy to move a simpler task.

Thanks, Adrian


Comment by paulproteus:

Hi Adrian! Sorry about the mess.

Here's what I suggest: I'll write more documentation for oh-bugimporters so that this simple task is actually simple.

Having said that, I'll also answer your questions briefly here.

On Tue, 26 Jun 2012, soonick5 wrote:

> Hello, > > Thanks again for all your useful answers. > > I followed all the instructions from our chat but when it came to play > with the Bugs class I didn't know what to do.

It's a Django model, so the shell gives you an interface to list all the rows in the database that correspond to that model.

You can get a list via:

Bub.all_bugs.all()

> > I also tried looking at the data being updated in the database but I didn't know > which table I needed to look at. In this same subject, a lot of the > documentation talks about a MySQL database but the only database I see in my > local is a SQLite database. Is this configured to be this way based on the > environment or do I need to have a MySQL database that I currently don't have?

It's the sqlite database. We changed the defaults a year ago, or so, and now that sqlite is the default.

> > I did a very shallow check of what this does: python manage.py > customs_daily_tasks and I didn't find where the actual importing of the code is > made. Do you think you can point me to a file I can follow and move from it?

Yes! Will do shortly.

> I am still pretty new to Python and Django so I may be asking some very > dump questions. If I am taking too long to do something that should be > really easy I would be happy to move a simpler task.

I think we can make this task work. Shortly I'll publish better docs and give you a link in this ticket.


Comment by soonick5:

I ran:

$ python manage.py customs_daily_tasks $ python manage.py shell_plus

And then this on the shell:

>>> Bug.all_bugs.all()

And I got this:

[<Bug: title='0' project='GNOME-Do' projectlanguage='C#' description='...'>, <Bug: title='1' project='d29187f31e2e4ba1b2ac43a74d6a40bf' projectlanguage='C' description='...'>]


Comment by soonick5:

search_bug is the table where the bugs information come from. Apparently running

$ python manage.py customs_daily_tasks

Isn't updating the db


Comment by soonick5:

Investigating /home/adrian/Dev/oh-mainline/mysite/customs/management/commands/custom_daily_tasks.py it seems like I should have some data in my profile_citation table but it is empty


Comment by paulproteus:

Hi soonick5!

I suggest that you keep things simple for now and try a more focused approach for this change. Instead of looking at the Django code, focus on the contents of oh-bugimporters, and use the oh-bugimporters test suite to play with the bug import process.

In the future it will be easier to run individual bug importers and see what data they download... perhaps you can even help with that transition in the future. For now, I'd say stick to running the code via the test suite.

I wrote up some new documentation for the oh-bugimporters code. Can you read http://oh-bugimporters.readthedocs.org/ and let me know if that clarifies things?

Especially pay attention to the "The role of multiple BugParsers" discussion.

-- Asheesh.


Comment by soonick5:

Hello Asheesh,

Thanks for creating that documentation, it was very helpful,

I was finally able to write some code (patch attached) that I hope you can take a look and let me know your comments, but I also have some questions.

  • In other projects I have been we usually create doc blocks for every method we create, including tests. Is this something that people do in python?
  • I usually add a message to all my assertions so the reader of my code can understand what exactly I am testing. Is this discouraged? Is the syntax I used ok?
  • I added the new field and value we are looking for in the TrackModel mock, I imagine I need to add those same values somewhere else. Where?

Thanks, Adrian


Comment by soonick5:

Hello Asheesh,

Just checking if you have had time to check my patch?

Regards, Adrian


Comment by soonick5:

Hello Asheesh,

It's been some time since I submitted this patch and you haven't checked it. If you are too busy, do you know if there is someone else who can help me review this?

Thanks, Adrian


Comment by paulproteus:

I should have time to review this today. Thanks and sorry!


Comment by paulproteus:

Hi Adrian! Sorry about the long delay; I have been traveling, but I could have let you know about the delay sooner! I just kept thinking I'd have time to review it.

Your patch is going along the right track -- with just a few changes, it can land.

Actually -- by the way -- your solution is more elegant than what I had imagined. So let's stick with your way and just fix a few things.

First, cosmetic problems. Your top commit has "Wip" in the commit log message, as if you saved your work but didn't clean things up before submitting. I recommend using "git commit --amend" to change just a commit log message, and "git rebase -i origin/master" to change more than that.

The value "good_for_newcomers" is supposed to just have a Boolean value. There's no way you would necessarily know that, before-hand, but because of what it corresponds to in the database a bool is all you need. No need for the helpful (but verbose) string you have there.

Beyond that:

Good job adding the test data!

The way that TrackerModel objects are supposed to work, at the moment you can only select a single value for bitesized_type. So you should implement logic that isn't splitting them on "," but instead supports each value individually.

To make that work well, you should have a separate TrackerModel for the difficulty: easy option. To make that, just copy-paste the current TrackerModel and name it HaskellTrackerModel, and change the fields as needed (for example, change "bitesized_type" to "difficulty", etc.) in the new HaskellTrackerModel.

I think that should be it.

I said up-top this is more tidy than the way I was thinking -- I was initially imagining that we'd have to create a separate BugParser class for this case, but you showed that it can be tidily implemented within the current TracBugParser. That is awesome to see!

I'm no longer doing as much active travel, so I should be much more available for reviews.

Thanks!

-- Asheesh.


Comment by soonick5:

Hello Asheesh,

Thanks for taking the time to check my patch. I have a few questions about your comments, tough.

I am not sure what is the verbose string you are talking about. Can you tell me which one is it?

Also, I am a little confused about what the next steps are. What I understood is: Remove the modification I did to the TrackerModel(Mock) class on the init file and instead create a new one called HaskellTrackerModel. If this is correct...should I make HaskellTrackerModel inherit from TrackerModel instead of Mock and just overwrite "bitesized_type" and "bitesized_text" attributes?

Is this the only change I need to make?

Thanks, Adrian


Comment by paulproteus:

Hi Adrian,

Sorry about my lack of clarity earlier! To reply to your questions:

  • By the "verbose string"I mean this:

    ('The bug is considered ' + 'good_for_newcomers because the value of the difficulty ' + 'field is: "Easy (less than 1 hour)"')

Good question about precise next steps. To clarify that:

  • Remove the modification you did to TrackerModel(Mock)
  • Create a new class called HaskellTrackerModel -- it's okay to subclass TrackerModel and just override the attributes you want to override. The subclassing idea is a good one, and didn't occcur to me!

Other mini next steps:

  • Adjust your test so that you refer to HaskellTrackerModel -- in test_bug_with_difficulty_easy_is_bitesize(), set the local variable "tm" to an instance of HaskellTrackerModel.
  • Amend your commit so it doesn't have "Wip" in the commit log message (although good on you for making work-in-progress commits -- those are helpful! It's just a good idea to clean that up before it lands in the main codebase).

That should be it. Thanks for your patience, and also for the code! Ping this ticket if you have any other questions.


Comment by soonick5:

Hello Asheesh,

Thanks for the instructions. I am attaching a new patch so you can give it a final review before I submit the pull request on github.

I like adding comments to my assertions so readers of the tests can more easily understand what each assertion is doing. For that reason I didn't remove the comment from my assertion. If this goes against any standard you have or for any reason think it is not correct I can remove it.

Please take a look and let me know if everything is correct, and if it is I will make a pull request on github.

Thanks, Adrian


Comment by paulproteus:

Hi soonick5!

This patch has just two trivial problems:

  • + assert returned_data['good_for_newcomers'], '''The bug is considered

This line ends with a space character, this sort of "trailing whitespace" can cause merge conflicts down the road. So it's better to remove the space character from the end of the line.

Also, in the Mock, you write: "Haskell(GHK) tracker" -- actually, it's "GHC", not GHK.

Beyond that, it looks great. I would merge it as-is, and if you want to make the above changes before submitting the pull request, that'd be fine with me.

Thank you for your hard work!


Comment by soonick5:

Hello Asheesh,

Thanks for all your help. I just submitted the pull request after making the corrections you suggested: https://github.com/openhatch/oh-bugimporters/pull/9

Regards, Adrian


Comment by soonick5:

Hello Asheesh,

I submitted my pull request a while ago and nobody has checked it. Can you please take a look and let me know if everything is correct.

I will try to connect to IRC one of these days to talk about the next steps.

Regards, Adrian Ancona


Comment by paulproteus:

I was looking for it, but didn't see it, and wondered if you hadn't submitted it.

Now I realize I was looking at the wrong repository! My gravest apologies; expect a review today!


Comment by paulproteus:

This change is merged in https://github.com/openhatch/oh-bugimporters/commit/9f8d267e2a02a726c4ffd4a214247ceba9026fb6

Thanks!


File at http://roundup-archive.openhatch.org/bugs/file514/0001-When-difficulty-field-is-Easy-less-than-1-hour-then-.patch by soonick5

Status: resolved Nosy List: mlinksva, paulproteus, soonick5 Priority: feature Imported from roundup ID: 626 (view archived page) Last modified: 2012-07-27.06:37:30