Open jensstutte opened 1 year ago
@jensstutte and @suhaibmujahid, I am new to this and have started looking into this issue. So far I have figured out that the component model training outputs a component model (a pickle file?). Is there any documentation on how to read this file? Or, if this is not the right direction, which file would you recommend that has all the necessary features (creation date, completion date, bug ID and associated components) for creating this time series of maintenance effectiveness values?
@gothwalritu this issue is pretty complex, I'd suggest picking something else.
@marco-c, I have some experience with time series analysis, and I would like to give it a shot.
@gothwalritu unfortunately there are quite a few other things to do, which require context, in order to retrieve the data to analyze, and then the actual work for this issue can start.
@marco-c, @suhaibmujahid I have been trying to understand the problem and the associated data. In the Element chat room, @jpangas mentioned a few links to go through first to develop an understanding of this task. I did so, and so far I understand that there are different products, components and teams.
index | component_name | team_name | product_name |
---|---|---|---|
0 | Fennec | Mozilla | Untriaged Bugs |
1 | Firefox | Mozilla | Untriaged Bugs |
2 | General | Mozilla | Untriaged Bugs |
3 | Compiler | Mozilla | Rhino Graveyard |
4 | Core | Mozilla | Rhino Graveyard |
... | ... | ... | ... |
2308 | FlightDeck | Mozilla | Mozilla Labs Graveyard |
2309 | Personas Plus | Mozilla | Mozilla Labs Graveyard |
2310 | Test Pilot | Mozilla | Mozilla Labs Graveyard |
2311 | Test Pilot Data Requests | Mozilla | Mozilla Labs Graveyard |
2312 | Test Pilot Studies | Mozilla | Mozilla Labs Graveyard |
The bugs are categorized in them.
The objective is to create a visualization of the time series of maintenance effectiveness indicator (MEI) values. So, first I will try to work with a single team, maybe "Mozilla". The function to calculate the MEI is defined in "bugzilla.py", so I can use it for the calculation.
team = 'Mozilla'
start_year = 2000
end_year = 2022
mei_values = {}
I tried plotting the graph as:
plt.title('Maintenance Effectiveness Indicator (MEI) Over Years for team: mozilla')
plt.xlabel('Year')
plt.ylabel('MEI Value')
And I am planning to do the time series analysis maybe using ARIMA for the same team. Let me know if I am going in the right direction. Otherwise, I will drop this task and move onto another one.
This looks promising. I will have to update the requirements a bit so that it is clearer what we actually want to see here, based on which calculations (cadence and time delta), and how to put that together. Please expect that to happen by Friday. Note that ultimately we will want to integrate this into bugbug's UI, but for now we are happy to have a standalone PoC. Note also that it is probably more representative for testing to use a different team than "Mozilla"; you might want to try "DOM LWS", for example.
Thank you so much! I will try with DOM LWS.
So for a given team (or other component selection) like "DOM LWS" and a given time period graphduration (say a year as example) we want to:
for week in graphduration
    for delta in [week - 1w, week - 1m, week - 3m]
        calculate ME, BDTime, WBDTime, Incoming, Closed with team, week, delta
Note that we want to calculate all those values ad-hoc on the latest data we have in bugzilla, even if they seem to be historical (bugs might move between components, change severity, etc).
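The loop above could be sketched roughly as follows. This is only an illustration of how the (week, delta) windows might be generated; the name `weekly_windows` and the 7/30/90-day lengths for the deltas are assumptions, and the actual metric calculation against Bugzilla data is left out.

```python
from datetime import date, timedelta

# Assumed lengths in days for the 1-week, 1-month and 3-month deltas.
DELTAS = {"1w": 7, "1m": 30, "3m": 90}


def weekly_windows(start, end, deltas=DELTAS):
    """Yield (week, delta_name, window_start) for every week and every delta."""
    week = start
    while week <= end:
        for name, days in deltas.items():
            # Each window covers [week - delta, week].
            yield week, name, week - timedelta(days=days)
        week += timedelta(weeks=1)


# Three weeks times three deltas gives nine windows.
windows = list(weekly_windows(date(2023, 1, 1), date(2023, 1, 15)))
```

Each window would then be passed, together with the team, to whatever function computes ME, BDTime, WBDTime, Incoming and Closed.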
This gives us a series of 3 value quintuples for each week over the graphduration. We can directly plot (with different colors for the week, month and 3 months delta) the values for ME, BDTime and WBDTime.
For the Incoming and Closed values we should choose only one delta and scale the value up to 12 months. I'd probably try to do so for the 1 month values (not only for the ease of calculation but also for the medium dynamic I expect). This yields for each week the percentage of bugs that would have been incoming/closed for an entire year (if done at the same rate) compared to our defect backlog. It gives an immediate feeling of "how big is my technical debt backlog" in relation to active work and incoming bugs.
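One possible reading of this scaling, as a minimal sketch: extrapolate the count observed in the delta window to a full year and express it as a percentage of the backlog. The function name `annualized_share`, the 365-day year and the sample numbers are assumptions for illustration, not what bugbug actually does.

```python
def annualized_share(count, delta_days, backlog_size):
    """Percentage of the backlog that `count` bugs per `delta_days` would
    amount to over a full year, if the rate stayed the same."""
    if backlog_size <= 0:
        raise ValueError("backlog_size must be positive")
    per_year = count * 365 / delta_days  # extrapolate the window to 12 months
    return 100 * per_year / backlog_size


# Made-up numbers: 12 bugs closed in a 30-day window, 400 open defects.
share = annualized_share(12, 30, 400)
```

A value above 100 would mean that, at the current rate, more bugs than the whole backlog would be incoming (or closed) within a year.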
Infinite values for BDTime/WBDTime can appear and may just be ignored for now, but sooner or later we'll need a good way to show them in order to make clear "here is a problem".
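For the time being, infinite values could simply be separated from the plottable ones while remembering where they occurred, so that they can be highlighted later. A small sketch, where `split_infinite` and the sample data are hypothetical:

```python
import math


def split_infinite(series):
    """Split (week, value) pairs into finite points and the weeks with an infinite value."""
    finite = [(week, value) for week, value in series if math.isfinite(value)]
    infinite_weeks = [week for week, value in series if math.isinf(value)]
    return finite, infinite_weeks


# Made-up BDTime series with one infinite value.
bd_time = [("2023-01-01", 4.2), ("2023-01-08", math.inf), ("2023-01-15", 3.9)]
finite, infinite_weeks = split_infinite(bd_time)
```

The weeks in `infinite_weeks` could later be drawn as markers or shaded regions to make clear "here is a problem". Note that NaN values would end up in neither list.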
@jensstutte, thank you for the explanation. It really helped me a lot to dive deeper into this. Here are the steps I will be following:
Instead of looping over years, I will loop over weeks within the specified year range. For each week, I will calculate the MEI for the 1-week, 1-month, and 3-month deltas.
For each week, I will store the MEI, BDTime, WBDTime, Incoming, and Closed values for the three different deltas.
I will create separate plots for each metric (ME, BDTime, WBDTime) with different colors or markers for each delta.
For Incoming and Closed values, I should choose one delta and scale the values up to a 12-month period.
Let me know if that is good enough for the start.
That sounds right. We can probably have BDTime and WBDTime on the same chart from the beginning, as they have the same scale and should have a comparable order of magnitude (and if not we would want to directly know). I think in the end I'd like to combine the burn down chart with the Incoming/Closed data as second vertical scale as supporting information. But let's first see how dense the information gets on single charts.
OK. So far I am able to produce these results with team 'DOM LWS', start year 2021 and end year 2022:
week | delta | MEI | BDTime | WBDTime | Incoming | Closed |
---|---|---|---|---|---|---|
2021-01-01 | 7 | 371.428571 | 4.276712 | 2.022783 | 0.446429 | 0.892857 |
2021-01-01 | 30 | 160.576923 | 8.331258 | 2.614481 | 4.618117 | 5.595027 |
2021-01-01 | 90 | 144.632768 | 9.480397 | 3.127449 | 13.811189 | 16.346154 |
2021-01-08 | 7 | 144.117647 | 3.563927 | 2.562192 | 1.248885 | 1.784121 |
2021-01-08 | 30 | 154.629630 | 6.545988 | 2.791734 | 4.340124 | 5.580159 |
2021-01-08 | 90 | 140.909091 | 10.182648 | 3.431507 | 13.835377 | 16.199650 |
I plotted the graph for MEI value
I am working on the burn down chart now and will update you as soon as I am finished with it. In the meantime, I would greatly appreciate your feedback. Thanks so much!
Here is the graph for BDTime and WBDTime on the same chart:
This chart showing the progression of the incoming and closed values:
And here is the combined chart with two vertical axis:
There is too much info in one chart.
Sorry for not coming back earlier here (actually I was convinced I had answered already time ago but I do not see the answer, so maybe I did not hit the send button or such?)!
This looks all very promising. I think in general we do not need to go back such a long period in time, 12 months should be enough. And we mostly want to look at the 3 months delta value in practice, so to keep things more readable we could remove the 1-month-delta curve everywhere. The weekly values then show the peaks and the 3-months-value shows the average we want to move to our target.
Thank you!
No problem at all! I appreciate your feedback and clarifications. It sounds like we're on the right track. I'll make the adjustments as per your suggestions: focusing on 12 months, removing the 1-month-delta curve, and emphasizing the 3-months-delta curve in the MEI and (W)BD Time charts. We'll also limit the scales and add graphical indicators as needed. I'll work on these refinements and will keep you updated on the progress. Just a heads up, I'll be on vacation for the next couple of weeks, so there might be a brief delay in my responses during that time. Thank you for your understanding!
I modified the MEI chart accordingly and here is the result:
@jensstutte, @marco-c Let me know your feedback on this. I am working on the other two charts, will update you once I am ready. Thank you.
That looks very good, thank you! The only minor thing might be to not waste so much vertical space for values going below 0, which is not possible by definition. And to use a more recent time interval, like all 2023 / last 12 months. Looking forward to the other updates!
@jensstutte, I am glad that I am able to produce these results with your guidance. In this chart I removed the vertical space below 0 in MEI and used the 12 months of 2023. Here is the updated chart for MEI over weeks for team "DOM LWS".
@jensstutte, here is the chart for BDTime and WBDTime over the 12 months of 2023 with the 3-month delta for team DOM LWS, with the scale limited to 15 years:
I am working on other charts as well. I will update the progress accordingly. Please let me know your comments on the previous charts. Thank you!
> @jensstutte, I am glad that I am able to produce these results with your guidance. In this chart I removed the vertical space below 0 in MEI and used the 12 months of 2023. Here is the updated chart for MEI over weeks for team "DOM LWS".
I think that nails it, thanks!
> @jensstutte, here is the chart for BDTime and WBDTime over the 12 months of 2023 with the 3-month delta for team DOM LWS, with the scale limited to 15 years:
That looks fine, too. It is interesting to see how higher severity bugs can move the needle fast for WBDTime but less fast for BDTime. And one can clearly see the correlation of MEI going close to 100% and WBDTime spiking up, which is what I expected to see.
> I am working on other charts as well. I will update the progress accordingly. Please let me know your comments on the previous charts. Thank you!
Thank you!
@jensstutte, yes, it's fascinating to observe how higher severity bugs impact WBDTime differently than BDTime. The correlation between MEI approaching 100% and the spike in WBDTime indicates a strong connection between maintenance effectiveness and the time to resolve bugs.
Here is the chart for Incoming and Closed. I first calculated the scaled values for Incoming and Closed by dividing them by 52 (the number of weeks in a year) and then multiplying by 100 to represent them as percentages. These scaled values indicate how many bugs are being processed in a week relative to the entire defect backlog for the year.
Then I created a line plot for Scaled Incoming and Scaled Closed, adding labels, a title, a legend, and grid lines for better visualization. Here is the chart:
Let me know your comments on it.
@gothwalritu just wanted to say great work, sorry I initially tried to ask you to work on something else :)
@marco-c, Thank you so much for your kind words! I believe we do everything with our best intentions.. so please don't apologize :)
@jensstutte, I think I made a mistake in the last chart. Let me work on this and I will get back to you ASAP.
@jensstutte, here is the chart for Incoming and Closed for the weekly values. I first calculated the scaled values for Incoming and Closed by dividing them by 52 (the number of weeks in a year) and then multiplying by 100 to represent them as percentages. These scaled values indicate how many bugs are being processed in a week relative to the entire defect backlog for the year.
Then I created a line plot for Scaled Incoming and Scaled Closed, adding labels, a title, a legend, and grid lines for better visualization. Here is the chart:
The previous chart mixed the data points for the 7-day and 90-day deltas in one line, which is why it showed a zigzag pattern. Kindly let me know if this chart is fine and I will create the combined chart for the 3-month delta values. Thanks!
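One way to avoid that kind of zigzag is to split the rows into one series per delta before plotting, so that each line only connects points of the same delta. A minimal sketch, where `series_by_delta` is a hypothetical helper and the Incoming values are taken from the DOM LWS table earlier in this thread:

```python
from collections import defaultdict


def series_by_delta(rows, metric):
    """Group (week, value) points per delta, each series sorted by week."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["delta"]].append((row["week"], row[metric]))
    return {delta: sorted(points) for delta, points in grouped.items()}


rows = [
    {"week": "2021-01-01", "delta": 7, "Incoming": 0.446429},
    {"week": "2021-01-01", "delta": 90, "Incoming": 13.811189},
    {"week": "2021-01-08", "delta": 7, "Incoming": 1.248885},
    {"week": "2021-01-08", "delta": 90, "Incoming": 13.835377},
]
series = series_by_delta(rows, "Incoming")
```

Each entry of `series` can then be drawn as its own line (for example `series[7]` for the weekly values and `series[90]` for the 3-month values).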
Thank you, that looks much better!
@jensstutte, I appreciate your comments.
I plotted the scaled Incoming and scaled Closed values for the 90-day (3-month) delta. Here is the graph:
and the combined chart of scaled incoming/closed and BD Time/WBD Time over 12 months (2023) for 3-months-delta:
Let me know your feedback on this. Thank you so much!
Hello @jensstutte, do I need to do anything else in this issue? Let me know.
Hi @gothwalritu! In order to better judge how we can integrate your code into our systems, would you mind sharing it via a PR here? The ultimate goal would be to add some of these diagrams to our bugbug UI, but for now a separate script will do.
Thank you!
Once we have a time series, we could construct a nice graph visualizing the trend for some of those values.
It should be possible to aggregate/filter as usual in the bugbug UI.
Updated requirements for the graph:
For a given team (or other component selection) like "DOM LWS" and a given time period graphduration (say a year as example) we want to:
for week in graphduration
    for delta in [week - 1w, week - 1m, week - 3m]
        calculate ME, BDTime, WBDTime, Incoming, Closed with team, week, delta
Note that we want to calculate all those values ad-hoc on the latest data we have in bugzilla, even if they seem to be historical (bugs might move between components, change severity, etc).
This gives us a series of 3 value quintuples for each week over the graphduration. We can directly plot (with different colors for the week, month and 3 months delta) the values for ME, BDTime and WBDTime.
For the Incoming and Closed values we should choose only one delta and scale the value up to 12 months. I'd probably try to do so for the 1 month values (not only for the ease of calculation but also for the medium dynamic I expect). This yields for each week the percentage of bugs that would have been incoming/closed for an entire year (if done at the same rate) compared to our defect backlog. It gives an immediate feeling of "how big is my technical debt backlog" in relation to active work and incoming bugs.