On bugs in the Trac bug tracker for the GHC project, the Bug.good_for_newcomers
(AKA bitesize) field should be set to True if the Trac data has a key called
'difficulty'
whose value is
'Easy (less than 1 hour)'
Maybe we should do this with a custom bug parser. We could also make the web UI
(even) more complex.
>
>
> I have a few questions before I can start making some progress on this ticket:
>
> - Should I be looking at oh-mainline or oh-bugimporters?
oh-bugimporters
> - Does the bug indexing happens on a cron job or is it manually triggered?
cron job
> - It would be helpful to know what is the current result and what is the
> expected result so I know when I can mark this as completed.
The currnt result is that when you parse a Trac bug, the 'difficulty' key
from the data table is ignored.
The expected result is that the 'difficulty' field can be used to infer
the value of "good_for_newcomers".
An easy implementation is to create a TracBugParser subclass that
overrides the extract_tracker_specific_data() method (I think that's what
it's called; blanking on name of method at the moment).
oh-bugimporters/bugimporters/bugzilla.py has sample BugParser subclasses.
Unlike the Bugzilla ones, you'll subclass TracBugParser and put it in
trac.py.
I started taking a look at the bug importers to make the suggested changes but
now I realized that I don't know how to run a bug importer so I can test what it
is doing.
Do you know how I can do a run of the bug importer or point me to documentation
about how to test this?
[18:40:59] <soonick5> Hello, does anyone know how can I execute a bug importer
from oh-bugimporters?
[18:42:12] <paulproteus> soonick5: Hi!
[18:42:15] <paulproteus> Sorry I didn't reply on the ticket; busy day.
[18:42:24] <paulproteus> Quickly, I think the best way is just to write a test
in the test suite.
[18:42:35] <paulproteus> There isn't a good stand-alone way to run individual
bug importers yet.
[18:42:49] <paulproteus> I want to migrate the bug importers to be more based on
"Scrapy" so that's easier, but for now it's kind of hard.
[18:42:57] <soonick5> Don't worry, I imagine you have your eyes in a lot of stuff
[18:43:05] <paulproteus> So my recommendation as a coder is to write tests, and
then run the test suite, and watch the test pass/fail.
[18:43:13] <paulproteus> Note also that you can run the code the way the
backend does.
[18:43:19] <paulproteus> To do that, get a copy of oh-mainline, and "cd" into it.
[18:43:25] <paulproteus> Make sure that it can find oh-bugimporters, and then:
[18:43:35] <paulproteus> python manage.py customs_daily_tasks
[18:43:40] <paulproteus> That, iirc, runs the bug importer.
[18:44:24] <paulproteus> But because oh-mainline and oh-bugimporter are somewhat
tightly-coupled, it's kind of a drag to test individual bug parsers/importers
this way.
[18:44:42] <soonick5> How do I know if oh-mainline can find oh-bugimporters?
[18:45:25] <paulproteus> python manage.py shell
[18:45:27] <paulproteus> import bugimporters
[18:45:32] <paulproteus> If that works, then you're in business.
[18:45:50] <paulproteus> There's some autodetect magic where oh-mainline looks
at ../oh-bugimporters and if it finds it there, it adds it to the path.
[18:48:12] <soonick5> ok, that seemed to work
[18:48:20] <soonick5> so once I run: python manage.py customs_daily_tasks
[18:48:43] <soonick5> where can I check that the bugs were actually imported?
will that reflect in the database?
[18:48:49] <paulproteus> Yeah, it'll be reflected in the database.
[18:49:16] <paulproteus> The best way to see how it's reflected in the database
is by: python manage.py shell_plus
[18:49:32] <paulproteus> Play with the Bug class there.
[18:50:56] <soonick5> Awesome. I think with all these information I can finally
start playing with my task
[18:50:58] <soonick5> thanks
[18:52:56] <paulproteus> My pleasure! Thanks for pinging here.
I followed all the instructions from our chat but when it came to play with the
Bugs class I didn't know what to do.
I also tried looking at the data being updated in the database but I didn't know
which table I needed to look at. In this same subject, a lot of the
documentation talks about a MySQL database but the only database I see in my
local is a SQLite database. Is this configured to be this way based on the
environment or do I need to have a MySQL database that I currently don't have?
I did a very shallow check of what this does: python manage.py
customs_daily_tasks and I didn't find where the actual importing of the code is
made. Do you think you can point me to a file I can follow and move from it?
I am still pretty new to Python and Django so I may be asking some very dump
questions. If I am taking too long to do something that should be really easy I
would be happy to move a simpler task.
Here's what I suggest: I'll write more documentation for oh-bugimporters
so that this simple task is actually simple.
Having said that, I'll also answer your questions briefly here.
On Tue, 26 Jun 2012, soonick5 wrote:
> Hello,
>
> Thanks again for all your useful answers.
>
> I followed all the instructions from our chat but when it came to play
> with the Bugs class I didn't know what to do.
It's a Django model, so the shell gives you an interface to list all the
rows in the database that correspond to that model.
You can get a list via:
Bub.all_bugs.all()
>
> I also tried looking at the data being updated in the database but I didn't know
> which table I needed to look at. In this same subject, a lot of the
> documentation talks about a MySQL database but the only database I see in my
> local is a SQLite database. Is this configured to be this way based on the
> environment or do I need to have a MySQL database that I currently don't have?
It's the sqlite database. We changed the defaults a year ago, or so, and
now that sqlite is the default.
>
> I did a very shallow check of what this does: python manage.py
> customs_daily_tasks and I didn't find where the actual importing of the code is
> made. Do you think you can point me to a file I can follow and move from it?
Yes! Will do shortly.
> I am still pretty new to Python and Django so I may be asking some very
> dump questions. If I am taking too long to do something that should be
> really easy I would be happy to move a simpler task.
I think we can make this task work. Shortly I'll publish better docs and
give you a link in this ticket.
Investigating
/home/adrian/Dev/oh-mainline/mysite/customs/management/commands/custom_daily_tasks.py
it seems like I should have some data in my profile_citation table but it is empty
I suggest that you keep things simple for now and try a more focused
approach for this change. Instead of looking at the Django code, focus on
the contents of oh-bugimporters, and use the oh-bugimporters test suite to
play with the bug import process.
In the future it will be easier to run individual bug importers and see
what data they download... perhaps you can even help with that transition
in the future. For now, I'd say stick to running the code via the test
suite.
I wrote up some new documentation for the oh-bugimporters code. Can you
read http://oh-bugimporters.readthedocs.org/ and let me know if that
clarifies things?
Especially pay attention to the "The role of multiple BugParsers"
discussion.
Thanks for creating that documentation, it was very helpful,
I was finally able to write some code (patch attached) that I hope you can take
a look and let me know your comments, but I also have some questions.
In other projects I have been we usually create doc blocks for every method we
create, including tests. Is this something that people do in python?
I usually add a message to all my assertions so the reader of my code can
understand what exactly I am testing. Is this discouraged? Is the syntax I used ok?
I added the new field and value we are looking for in the TrackModel mock, I
imagine I need to add those same values somewhere else. Where?
It's been some time since I submitted this patch and you haven't checked it. If
you are too busy, do you know if there is someone else who can help me review this?
Hi Adrian! Sorry about the long delay; I have been traveling, but I could have let you know about the delay sooner! I just kept
thinking I'd have time to review it.
Your patch is going along the right track -- with just a few changes, it can land.
Actually -- by the way -- your solution is more elegant than what I had imagined. So let's stick with your way and just fix a few
things.
First, cosmetic problems. Your top commit has "Wip" in the commit log message, as if you saved your work but didn't clean things up
before submitting. I recommend using "git commit --amend" to change just a commit log message, and "git rebase -i origin/master" to
change more than that.
The value "good_for_newcomers" is supposed to just have a Boolean value. There's no way you would necessarily know that, before-hand,
but because of what it corresponds to in the database a bool is all you need. No need for the helpful (but verbose) string you have
there.
Beyond that:
Good job adding the test data!
The way that TrackerModel objects are supposed to work, at the moment you can only select a single value for bitesized_type. So you
should implement logic that isn't splitting them on "," but instead supports each value individually.
To make that work well, you should have a separate TrackerModel for the difficulty: easy option. To make that, just copy-paste the
current TrackerModel and name it HaskellTrackerModel, and change the fields as needed (for example, change "bitesized_type" to
"difficulty", etc.) in the new HaskellTrackerModel.
I think that should be it.
I said up-top this is more tidy than the way I was thinking -- I was initially imagining that we'd have to create a separate
BugParser class for this case, but you showed that it can be tidily implemented within the current TracBugParser. That is awesome to
see!
I'm no longer doing as much active travel, so I should be much more available for reviews.
Thanks for taking the time to check my patch. I have a few questions about your
comments, tough.
I am not sure what is the verbose string you are talking about. Can you tell me
which one is it?
Also, I am a little confused about what the next steps are. What I understood
is: Remove the modification I did to the TrackerModel(Mock) class on the init
file and instead create a new one called HaskellTrackerModel. If this is
correct...should I make HaskellTrackerModel inherit from TrackerModel instead of
Mock and just overwrite "bitesized_type" and "bitesized_text" attributes?
Sorry about my lack of clarity earlier! To reply to your questions:
By the "verbose string"I mean this:
('The bug is considered ' + 'good_for_newcomers because the value of the
difficulty ' + 'field is: "Easy (less than 1 hour)"')
Good question about precise next steps. To clarify that:
Remove the modification you did to TrackerModel(Mock)
Create a new class called HaskellTrackerModel -- it's okay to subclass
TrackerModel and just override the attributes you want to override. The
subclassing idea is a good one, and didn't occcur to me!
Other mini next steps:
Adjust your test so that you refer to HaskellTrackerModel -- in
test_bug_with_difficulty_easy_is_bitesize(), set the local variable "tm" to an
instance of HaskellTrackerModel.
Amend your commit so it doesn't have "Wip" in the commit log message (although
good on you for making work-in-progress commits -- those are helpful! It's just a
good idea to clean that up before it lands in the main codebase).
That should be it. Thanks for your patience, and also for the code! Ping this
ticket if you have any other questions.
Thanks for the instructions. I am attaching a new patch so you can give it a
final review before I submit the pull request on github.
I like adding comments to my assertions so readers of the tests can more easily
understand what each assertion is doing. For that reason I didn't remove the
comment from my assertion. If this goes against any standard you have or for any
reason think it is not correct I can remove it.
Please take a look and let me know if everything is correct, and if it is I will
make a pull request on github.
+ assert returned_data['good_for_newcomers'], '''The bug is considered
This line ends with a space character, this sort of "trailing whitespace" can
cause merge conflicts down the road. So it's better to remove the space
character from the end of the line.
Also, in the Mock, you write: "Haskell(GHK) tracker" -- actually, it's "GHC",
not GHK.
Beyond that, it looks great. I would merge it as-is, and if you want to make the
above changes before submitting the pull request, that'd be fine with me.
Comment by paulproteus:
'difficulty'
whose value is
'Easy (less than 1 hour)'
Maybe we should do this with a custom bug parser. We could also make the web UI (even) more complex.
Comment by paulproteus:
Comment by paulproteus:
Comment by soonick5:
Thanks, Adrian
Comment by paulproteus:
> > > I have a few questions before I can start making some progress on this ticket: > > - Should I be looking at oh-mainline or oh-bugimporters?
oh-bugimporters
> - Does the bug indexing happens on a cron job or is it manually triggered?
cron job
> - It would be helpful to know what is the current result and what is the > expected result so I know when I can mark this as completed.
The currnt result is that when you parse a Trac bug, the 'difficulty' key from the data table is ignored.
The expected result is that the 'difficulty' field can be used to infer the value of "good_for_newcomers".
An easy implementation is to create a TracBugParser subclass that overrides the extract_tracker_specific_data() method (I think that's what it's called; blanking on name of method at the moment). oh-bugimporters/bugimporters/bugzilla.py has sample BugParser subclasses. Unlike the Bugzilla ones, you'll subclass TracBugParser and put it in trac.py.
-- Asheesh.
Comment by soonick5:
I started taking a look at the bug importers to make the suggested changes but now I realized that I don't know how to run a bug importer so I can test what it is doing.
Do you know how I can do a run of the bug importer or point me to documentation about how to test this?
Thanks, Adrian
Comment by paulproteus:
Comment by soonick5:
Thanks again for all your useful answers.
I followed all the instructions from our chat but when it came to play with the Bugs class I didn't know what to do.
I also tried looking at the data being updated in the database but I didn't know which table I needed to look at. In this same subject, a lot of the documentation talks about a MySQL database but the only database I see in my local is a SQLite database. Is this configured to be this way based on the environment or do I need to have a MySQL database that I currently don't have?
I did a very shallow check of what this does: python manage.py customs_daily_tasks and I didn't find where the actual importing of the code is made. Do you think you can point me to a file I can follow and move from it?
I am still pretty new to Python and Django so I may be asking some very dump questions. If I am taking too long to do something that should be really easy I would be happy to move a simpler task.
Thanks, Adrian
Comment by paulproteus:
Here's what I suggest: I'll write more documentation for oh-bugimporters so that this simple task is actually simple.
Having said that, I'll also answer your questions briefly here.
On Tue, 26 Jun 2012, soonick5 wrote:
> Hello, > > Thanks again for all your useful answers. > > I followed all the instructions from our chat but when it came to play > with the Bugs class I didn't know what to do.
It's a Django model, so the shell gives you an interface to list all the rows in the database that correspond to that model.
You can get a list via:
Bub.all_bugs.all()
> > I also tried looking at the data being updated in the database but I didn't know > which table I needed to look at. In this same subject, a lot of the > documentation talks about a MySQL database but the only database I see in my > local is a SQLite database. Is this configured to be this way based on the > environment or do I need to have a MySQL database that I currently don't have?
It's the sqlite database. We changed the defaults a year ago, or so, and now that sqlite is the default.
> > I did a very shallow check of what this does: python manage.py > customs_daily_tasks and I didn't find where the actual importing of the code is > made. Do you think you can point me to a file I can follow and move from it?
Yes! Will do shortly.
> I am still pretty new to Python and Django so I may be asking some very > dump questions. If I am taking too long to do something that should be > really easy I would be happy to move a simpler task.
I think we can make this task work. Shortly I'll publish better docs and give you a link in this ticket.
Comment by soonick5:
$ python manage.py customs_daily_tasks $ python manage.py shell_plus
And then this on the shell:
>>> Bug.all_bugs.all()
And I got this:
[<Bug: title='0' project='GNOME-Do' projectlanguage='C#' description='...'>, <Bug: title='1' project='d29187f31e2e4ba1b2ac43a74d6a40bf' projectlanguage='C' description='...'>]
Comment by soonick5:
$ python manage.py customs_daily_tasks
Isn't updating the db
Comment by soonick5:
Comment by paulproteus:
I suggest that you keep things simple for now and try a more focused approach for this change. Instead of looking at the Django code, focus on the contents of oh-bugimporters, and use the oh-bugimporters test suite to play with the bug import process.
In the future it will be easier to run individual bug importers and see what data they download... perhaps you can even help with that transition in the future. For now, I'd say stick to running the code via the test suite.
I wrote up some new documentation for the oh-bugimporters code. Can you read http://oh-bugimporters.readthedocs.org/ and let me know if that clarifies things?
Especially pay attention to the "The role of multiple BugParsers" discussion.
-- Asheesh.
Comment by soonick5:
Thanks for creating that documentation, it was very helpful,
I was finally able to write some code (patch attached) that I hope you can take a look and let me know your comments, but I also have some questions.
Thanks, Adrian
Comment by soonick5:
Just checking if you have had time to check my patch?
Regards, Adrian
Comment by soonick5:
It's been some time since I submitted this patch and you haven't checked it. If you are too busy, do you know if there is someone else who can help me review this?
Thanks, Adrian
Comment by paulproteus:
Comment by paulproteus:
Your patch is going along the right track -- with just a few changes, it can land.
Actually -- by the way -- your solution is more elegant than what I had imagined. So let's stick with your way and just fix a few things.
First, cosmetic problems. Your top commit has "Wip" in the commit log message, as if you saved your work but didn't clean things up before submitting. I recommend using "git commit --amend" to change just a commit log message, and "git rebase -i origin/master" to change more than that.
The value "good_for_newcomers" is supposed to just have a Boolean value. There's no way you would necessarily know that, before-hand, but because of what it corresponds to in the database a bool is all you need. No need for the helpful (but verbose) string you have there.
Beyond that:
Good job adding the test data!
The way that TrackerModel objects are supposed to work, at the moment you can only select a single value for bitesized_type. So you should implement logic that isn't splitting them on "," but instead supports each value individually.
To make that work well, you should have a separate TrackerModel for the difficulty: easy option. To make that, just copy-paste the current TrackerModel and name it HaskellTrackerModel, and change the fields as needed (for example, change "bitesized_type" to "difficulty", etc.) in the new HaskellTrackerModel.
I think that should be it.
I said up-top this is more tidy than the way I was thinking -- I was initially imagining that we'd have to create a separate BugParser class for this case, but you showed that it can be tidily implemented within the current TracBugParser. That is awesome to see!
I'm no longer doing as much active travel, so I should be much more available for reviews.
Thanks!
-- Asheesh.
Comment by soonick5:
Thanks for taking the time to check my patch. I have a few questions about your comments, tough.
I am not sure what is the verbose string you are talking about. Can you tell me which one is it?
Also, I am a little confused about what the next steps are. What I understood is: Remove the modification I did to the TrackerModel(Mock) class on the init file and instead create a new one called HaskellTrackerModel. If this is correct...should I make HaskellTrackerModel inherit from TrackerModel instead of Mock and just overwrite "bitesized_type" and "bitesized_text" attributes?
Is this the only change I need to make?
Thanks, Adrian
Comment by paulproteus:
Sorry about my lack of clarity earlier! To reply to your questions:
By the "verbose string"I mean this:
('The bug is considered ' + 'good_for_newcomers because the value of the difficulty ' + 'field is: "Easy (less than 1 hour)"')
Good question about precise next steps. To clarify that:
Other mini next steps:
That should be it. Thanks for your patience, and also for the code! Ping this ticket if you have any other questions.
Comment by soonick5:
Thanks for the instructions. I am attaching a new patch so you can give it a final review before I submit the pull request on github.
I like adding comments to my assertions so readers of the tests can more easily understand what each assertion is doing. For that reason I didn't remove the comment from my assertion. If this goes against any standard you have or for any reason think it is not correct I can remove it.
Please take a look and let me know if everything is correct, and if it is I will make a pull request on github.
Thanks, Adrian
Comment by paulproteus:
This patch has just two trivial problems:
This line ends with a space character, this sort of "trailing whitespace" can cause merge conflicts down the road. So it's better to remove the space character from the end of the line.
Also, in the Mock, you write: "Haskell(GHK) tracker" -- actually, it's "GHC", not GHK.
Beyond that, it looks great. I would merge it as-is, and if you want to make the above changes before submitting the pull request, that'd be fine with me.
Thank you for your hard work!
Comment by soonick5:
Thanks for all your help. I just submitted the pull request after making the corrections you suggested: https://github.com/openhatch/oh-bugimporters/pull/9
Regards, Adrian
Comment by soonick5:
I submitted my pull request a while ago and nobody has checked it. Can you please take a look and let me know if everything is correct.
I will try to connect to IRC one of these days to talk about the next steps.
Regards, Adrian Ancona
Comment by paulproteus:
Now I realize I was looking at the wrong repository! My gravest apologies; expect a review today!
Comment by paulproteus:
Thanks!
File at http://roundup-archive.openhatch.org/bugs/file514/0001-When-difficulty-field-is-Easy-less-than-1-hour-then-.patch by soonick5
Status: resolved Nosy List: mlinksva, paulproteus, soonick5 Priority: feature Imported from roundup ID: 626 (view archived page) Last modified: 2012-07-27.06:37:30