@bartlettroscoe opened this issue 6 years ago
@fryeguy52, as we discussed yesterday, this would be a fun thing to start working on, and it would have a large impact in making sure that people follow up on ATDM Trilinos GitHub issues and would require less of our own personal time to do so.
@fryeguy52, other than just PyGithub, some other things to consider include:
@fryeguy52,
FYI: based on our conversation today, I just added the scope:
Also, this tool could check that if all of the tests are passing and the GitHub issue was closed, then it could automatically remove the entries for that issue from a list of `*.csv` files. And, if it detects that tests are not all passing and the GitHub issue is closed, then it could automatically re-open the issue and provide a summary of the test results. That would make it so that we would never need to manually re-open GitHub issues that should not have been closed, and it would make it so that we would not have to manually remove entries from the `*.csv` files for closed issues.
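For concreteness, a minimal sketch of that close/re-open logic using PyGithub (the helpers `summarizeTestResults` and `removeIssueEntriesFromCsvFiles` are hypothetical stand-ins, not existing code):

```python
def summarizeTestResults(testDicts):
    # Hypothetical: build a short summary of the non-passing test dicts.
    return "Non-passing tests: %d" % sum(
        1 for t in testDicts if t['status'] != 'Passed')

def removeIssueEntriesFromCsvFiles(issueNumber):
    # Hypothetical: drop this issue's rows from the tracked *.csv files.
    pass

def reconcileClosedIssue(issue, testDicts):
    # 'issue' is assumed to be a PyGithub Issue object.
    allPassing = all(t['status'] == 'Passed' for t in testDicts)
    if issue.state == 'closed':
        if allPassing:
            removeIssueEntriesFromCsvFiles(issue.number)
        else:
            issue.create_comment(summarizeTestResults(testDicts))
            issue.edit(state='open')  # re-open the wrongly closed issue
```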
FYI: Joe Frye mentioned that Aaron L. has some code that pulls down GitHub Issue data and sends out summary emails. We need to ask Aaron to look at his scripts once we can get down to this.
CC: @rmmilewi
@rmmilewi, take a look at:
to understand the entire ATDM Trilinos build process and where this fits in and especially:
@rmmilewi, as discussed at our meeting just now, I will add the option `--write-test-data-to-file=<file>` to the `cdash_analyze_and_report.py` tool and I will add a README.md file for how to run the tool on the ATDM Trilinos builds to generate the data your tool/bot will work with.
@rmmilewi, I added the `cdash_analyze_and_report.py` option `--write-test-data-to-file=<file>` and updated the TrilinosATDMStatus scripts to call that option. If you follow the instructions at:
you can get set up locally to run the driver script as:
$ ../TrilinosATDMStatus/trilinos_atdm_builds_status_driver.sh --date=yesterday \
--email-from-address= --send-email-to=${USER}@sandia.gov
That will produce the files:
$ ls -w 1 *TestData.py
promotedAtdmTrilinosTestData.py
specializedAtdmTrilinosCleanupTestData.py
You read these in Python as:
>>> testData = []
>>> testData.extend(eval(open('promotedAtdmTrilinosTestData.py', 'r').read()))
>>> testData.extend(eval(open('specializedAtdmTrilinosCleanupTestData.py', 'r').read()))
>>> print(len(testData))
167
You want to group these by the 'issue_tracker_url' field of the test dicts and then create an HTML string table using the function `createCDashTestHtmlTableStr()` shown at:
or perhaps the function `createHtmlTableStr()` shown at:
Let me know if you have questions. Otherwise, I can mock up how this might look. It is not much code to create the markdown and HTML that would go into the GitHub Issue comment.
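As a minimal sketch of that grouping step (assuming `testData` is the list of test dicts read as shown above, each carrying an 'issue_tracker_url' key):

```python
from collections import defaultdict

# 'testData' as read above from the *TestData.py files.
testData = []
testData.extend(eval(open('promotedAtdmTrilinosTestData.py', 'r').read()))
testData.extend(eval(open('specializedAtdmTrilinosCleanupTestData.py', 'r').read()))

# Group the test dicts by their 'issue_tracker_url' field.
testsByIssueUrl = defaultdict(list)
for testDict in testData:
    testsByIssueUrl[testDict['issue_tracker_url']].append(testDict)

# One HTML table (and one GitHub comment) would then be generated per issue.
for issueUrl, issueTestDicts in testsByIssueUrl.items():
    print(issueUrl, len(issueTestDicts))
```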
@rmmilewi, I think I want to change these file formats over to proper JSON. When I do that at some point, I will let you know. It should be pretty easy to adjust.
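Once that switch happens, the `eval()` calls above would presumably become plain JSON reads (a sketch; the `.json` file names are an assumption):

```python
import json

# Hypothetical file names after the switch from Python-dict files to JSON:
testData = []
with open('promotedAtdmTrilinosTestData.json', 'r') as f:
    testData.extend(json.load(f))
with open('specializedAtdmTrilinosCleanupTestData.json', 'r') as f:
    testData.extend(json.load(f))
```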
@bartlettroscoe So, I'm sorting out the design and assessing the requirements based on the notes I took, the audio recording of our session, and what's described here on this issue. Could you provide me with an example copy of the data that you want the tool to read in? One that you generated, looked over, and confirmed was valid and as-expected? I know I can fetch this data myself, but I want another human in the loop on this.
> I know I can fetch this data myself, but I want another human in the loop on this.
@rmmilewi, I will be in the loop as much as needed, but it should be trivial to generate the data yourself following the instructions at:
However, here is the file promotedAtdmTrilinosTestData.py generated from the command:
$ ../TrilinosATDMStatus/trilinos_promoted_atdm_builds_status.sh --date=2020-03-06
That is only partial data from today, but it gives tests in all of the different categories 'twoif', 'twip', 'twim' and 'twif' in the associated summary HTML email text promotedAtdmTrilinosBuilds.html.
Please make sure that you can run that driver on your local machine and generate this same data as well. Again, it should be trivial if you have Python 2.7+.
Just download those files, remove the `.txt` extension, and you should be able to use them right away.
Have a look at that data and let me know if you have any questions about it. See the basic instructions above.
@bartlettroscoe It's trivial, I know, and I'll be generating plenty of data myself, but it's an important step in the process of stakeholder engagement for me that I have some agreed-upon, canonical data that captures the phenomena of interest. Same goes for any other data you think I may need (like that CSV file you showed me). In this case, either I can generate it myself and send it to you, or we can use the data you just generated, so long as I get a confirmation from you.
Speaking of requirements, does `cdash_analyze_and_report.py` require Python 2? I just wanted to make sure.
> Same goes for any other data you think I may need (like that CSV file you showed me).
@rmmilewi, with my current thinking, the tool and process being written in this issue should never need to look at those CSV files. All of the data needed should be present in the Python dicts for each of these tests. If there is missing data, then we need to add it.
> In this case, either I can generate it myself and send it to you, or we can use the data you just generated, so long as I get a confirmation from you.
The best way to do this is to ask "Hey Ross, what testing day looks to be a good day for running my tool?". That is just a string YYYY-MM-DD passed in the `--date` argument.
> Speaking of requirements, does cdash_analyze_and_report.py require Python 2? I just wanted to make sure.
It is only currently being tested with Python 2. Making sure it works with Python 3 is not a lot of work. There is an overlap of Python 2 and 3 that allows a Python program to work with both. (I need to figure out how to add Python 3 testing to the Travis CI testing for TriBITS.)
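For reference, the usual 2-and-3 overlap idioms look something like this (a generic sketch, not code from TriBITS itself):

```python
# Write to the common subset of Python 2.7 and 3.x:
from __future__ import print_function, division

try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO        # Python 3

print("runs under both Python 2.7 and 3.x")
```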
> With my current thinking, the tool and process being written in this issue should never need to look at those CSV files. All of the data needed should be present in the Python dicts for each of these tests. If there is missing data, then we need to add it.
@bartlettroscoe Ah, gotcha! Good to know!
> The best way to do this is to ask "Hey Ross, what testing day looks to be a good day for running my tool?". That is just a string YYYY-MM-DD passed in the --date argument.
Not a bad idea. It might be worthwhile to keep me in the loop for any noteworthy test/bug-related events. It would be interesting to build up a dataset of that sort.
> It is only currently being tested with Python 2. Making sure it works with Python 3 is not a lot of work. There is an overlap of Python 2 and 3 that allows a Python program to work with both. (I need to figure out how to add Python 3 testing to the Travis CI testing for TriBITS.)
Cool! From the looks of it, I can interact with your tools through your scripts, so I'm not concerned about any compatibility issues between your code and mine. For my part in this I'm leaning towards making sure my code is Python-3-compatible. It's possible that elements of this codebase could be reused in the future, and seeing as Python 2 has reached end of life, I expect Python 3 to be the norm for future development. Miranda and I reached the same conclusion about the Traffic code we've been developing. Of course, I'm open to any ideas or preferences you might have.
> I can interact with your tools through your scripts, so I'm not concerned about any compatibility issues between your code and mine.
@rmmilewi, not entirely. The code that generates the HTML tables needs to be reused. Please don't write that stuff again. That will just make more code to maintain.
> It's possible that elements of this codebase could be reused in the future, and seeing as Python 2 has reached end of life, I expect Python 3 to be the norm for future development.
The reality is that as long as Python 2.7 is the default Python on SNL CEE RHEL7 systems, our Python tools will need to support Python 2.7 for many years to come.
> The code that generates the HTML tables needs to be reused. Please don't write that stuff again. That will just make more code to maintain.
@bartlettroscoe Oh, of course. I don't plan on reinventing anything that we already have, don't worry.
> The reality is that as long as Python 2.7 is the default Python on SNL CEE RHEL7 systems, our Python tools will need to support Python 2.7 for many years to come.
Fair point. I'll be testing the code against 2.7, but I'll also be testing against 3.x because I should be able to write this code in a 2-or-3 compatible way anyway. If that fails, I will at least ensure that everything can run smoothly on 2.7.
> I'll be testing the code against 2.7, but I'll also be testing against 3.x because I should be able to write this code in a 2-or-3 compatible way anyway. If that fails, I will at least ensure that everything can run smoothly on 2.7.
See the advice in:
The full TriBITS test suite has been previously ported and tested with Python 2.6, 2.7 and some version of 3.x. Just need to add a Python 3 build to the TriBITS Travis CI testing to make sure this is maintained.
@rmmilewi, FYI: As explained in https://github.com/TriBITSPub/TriBITS/issues/305 you are going to see some tests with the test dict 'status' field value `Missing / Failed` (in addition to `Passed`, `Not Run`, `Failed`, and `Missing`). I am not sure how this special test status should be handled in the bot being designed and written in this Issue. For the purposes of informing Trilinos developers, you don't want to report this as "Missing" or "Failed".
@rmmilewi, let's try to have as much of the conversation about this clearly UUR topic in this GitHub issue and not in emails. (Emails don't provide any long-term traceability. We want design discussion archived for all time.)
From: Bartlett, Roscoe A
Sent: Wednesday, May 27, 2020 9:40 AM
To: Milewicz, Reed rmilewi@sandia.gov
Subject: RE: Regarding demo of Grover
Yes, we know GitHub Markdown supports the HTML tables as produced by the Python code used in cdash_analyze_and_report.py. That is how I produced the tables shown in:
https://github.com/trilinos/Trilinos/issues/3887#issue-381335600
For now, letβs just let them render the way they are. The tables shown in:
https://github.com/trilinos/Trilinos/issues/3887#issue-381335600
look okay to start with.
We can worry about better rendering in a later iteration.
-Ross
Dr. Roscoe A. Bartlett, PhD
https://bartlettroscoe.github.io/
Sandia National Laboratories
From: Milewicz, Reed rmilewi@sandia.gov
Sent: Wednesday, May 27, 2020 9:36 AM
To: Bartlett, Roscoe A rabartl@sandia.gov
Subject: Re: Regarding demo of Grover
Ah, sorry, meant to get back to you yesterday. I should have an answer today, hopefully! In single-responsibility-principle fashion, I divided up comment-posting functionality into an IssueTrackerPublisher class and an IssueTrackerCommentFormatter class, and there's a stub for TriBITSCommentFormatter that I was saving for this week.
Previously, I did some testing to make sure that GitHub markdown supports raw HTML tables (it does), and if it turns out that they're not rendering the way that we want, we could either modify the code in TriBITS, or do some find/replace magic on the resulting string. In any case, I should have an answer soon.
@bartlettroscoe Hey there,
Question for you, just so I'm clear on the requirements for Grover.
Right now, I can pull in any data that gets generated from any of the scripts you run, can check against a CSV file, etc. For example, if I just wanted to poll the status of promoted ATDM builds, I'd specify that script in the config file:
{
  'scriptPath' : "trilinos_promoted_atdm_builds_status.sh",
  'arguments' : ["--date=today"],
  'results' : [
    {'resultPath' : "promotedAtdmTrilinosTestData.py",
     'behavior' : "EVALUATE",
     'label' : "promotedAtdmTrilinosTestData"
    },
  ]
}
When I call `grover updateissues`, Grover can call the scripts, parse the CDash entries, organize them by issue tracker, and then kick off a set of requests to post updates to those threads (assuming they exist, are open, and are unlocked):
[debug] Request created to publish digest to issue #5892 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #6790 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #5002 (2 entries associated with this issue).
[debug] Request created to publish digest to issue #6796 (8 entries associated with this issue).
[debug] Request created to publish digest to issue #5006 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #6799 (8 entries associated with this issue).
[debug] Request created to publish digest to issue #6801 (3 entries associated with this issue).
[debug] Request created to publish digest to issue #6804 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #6805 (4 entries associated with this issue).
[debug] Request created to publish digest to issue #6553 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #6051 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #5545 (2 entries associated with this issue).
[debug] Request created to publish digest to issue #7089 (7 entries associated with this issue).
[debug] Request created to publish digest to issue #7090 (3 entries associated with this issue).
[debug] Request created to publish digest to issue #6070 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #6455 (6 entries associated with this issue).
[debug] Request created to publish digest to issue #6329 (6 entries associated with this issue).
[debug] Request created to publish digest to issue #6333 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #5310 (2 entries associated with this issue).
[debug] Request created to publish digest to issue #6216 (2 entries associated with this issue).
[debug] Request created to publish digest to issue #6991 (4 entries associated with this issue).
[debug] Request created to publish digest to issue #6882 (36 entries associated with this issue).
[debug] Request created to publish digest to issue #6246 (2 entries associated with this issue).
I tested out the table-generating code in CDashAnalyzeAndReport on GitHub by routing the output to a demo repository I created, and the tables look good. Grover can report the passing, missing, and failing tests associated with the issue. Everything runs on 2.7+ and 3.7+, and I have numerous unit tests covering it all (well, almost, just need to add a few more). Anyway, question for you, assuming everything I just said sounds reasonable (if not, then I will work to make it reasonable by next Tuesday, haha). Which builds are you wanting to report on? All the ATDM builds? Just the builds listed in the CSV file of your choice? I just wanted to confirm exactly what you were wanting in this iteration. Thanks!
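A minimal sketch of what that per-issue publishing step might look like with PyGithub (`entriesByIssue` and `formatDigestComment` below are illustrative stand-ins, not Grover's actual API):

```python
from github import Github

gh = Github("<personal-access-token>")      # token is a placeholder
repo = gh.get_repo("trilinos/Trilinos")

# Example input: test-dict entries already grouped by issue number.
entriesByIssue = {7089: [{'testName': 'SomeTest', 'status': 'Failed'}]}

def formatDigestComment(entries):
    # Hypothetical formatter; the real one builds the HTML tables.
    return "Tracked tests for this issue: %d entries" % len(entries)

for issueNumber, entries in entriesByIssue.items():
    issue = repo.get_issue(number=issueNumber)
    # Only publish to issues that exist, are open, and are unlocked.
    if issue.state == "open" and not issue.locked:
        issue.create_comment(formatDigestComment(entries))
```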
@rmmilewi, this sounds really great!
Can you post a sample in a comment below in this issue of what one of the more interesting generated comments will look like? (Or even post a few examples, covering passing, missing, and failing tests.)
> Anyway, question for you, assuming everything I just said sounds reasonable (if not, then I will work to make it reasonable by next Tuesday, haha). Which builds are you wanting to report on? All the ATDM builds? Just the builds listed in the CSV file of your choice? I just wanted to confirm exactly what you were wanting in this iteration.
That is a great question. If possible, can you please combine the test results for the two scripts:
in order to create the master list? We have GitHub issues that relate to builds going to the (promoted) 'ATDM' CDash group and the 'Specialized' CDash group, and some GitHub issues involve tests from both of these sets.
Does that make sense?
@rmmilewi, one issue that I just realized is that we need to consider how to handle missing builds (and therefore incomplete test results). Currently, the `cdash_analyze_and_report.py` tool will list out missing builds at the top of the summary email but will not list out the tests associated with those missing builds. The issue is that we don't want to post a comment to a GitHub issue listing only passing tests but not mentioning anything about tests that might be failing but have missing test results because the associated builds were missing. The Trilinos developer might think that everything is good and close the issue, but there might be failing tests in the missing builds posted the next day (or later days).
My initial idea is to add a new option `--report-missing-tests-from-missing-builds=on` to the `cdash_analyze_and_report.py` tool that would result in the tests associated with a given missing build being reported as missing (perhaps with status `Build Missing`?) and therefore have those tests also show up in the output files promotedAtdmTrilinosTestData.py and specializedAtdmTrilinosCleanupTestData.py. Therefore, your Grover tool would not need to change at all (but would have to pass an extra argument in the invocation of the driver scripts).
What do you think about that?
Anyway, we can discuss that idea when we talk.
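For concreteness, a hedged sketch of what that option might do inside the tool (the field names and helper here are assumptions for illustration, not the actual TriBITS code):

```python
# For each expected build that is missing, synthesize a test dict with the
# special 'Build Missing' status so it flows into the *TestData.py output.
def addMissingTestsForMissingBuilds(missingExpectedBuilds,
                                    testsWithIssueTrackers):
    syntheticTestDicts = []
    for build in missingExpectedBuilds:
        for test in testsWithIssueTrackers:
            if test['buildName'] == build['buildName']:
                testDict = dict(test)          # copy the tracked-test info
                testDict['status'] = 'Build Missing'
                syntheticTestDicts.append(testDict)
    return syntheticTestDicts
```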
@rmmilewi,
I should mention that when you run these tools, you will need to turn off the emails getting sent out by using the argument `--send-email-to=`. Also, you need to run this after the last testing day is complete, so you will need to run with `--date=yesterday` instead of `--date=today`. So putting this all together you would have:
[
  {
    'scriptPath' : "trilinos_promoted_atdm_builds_status.sh",
    'arguments' : ["--date=yesterday", "--send-email-to="],
    'results' : [
      {'resultPath' : "promotedAtdmTrilinosTestData.py",
       'behavior' : "EVALUATE",
       'label' : "promotedAtdmTrilinosTestData"
      },
    ]
  },
  {
    'scriptPath' : "trilinos_specialized_atdm_builds_status.sh",
    'arguments' : ["--date=yesterday", "--send-email-to="],
    'results' : [
      {'resultPath' : "specializedAtdmTrilinosCleanupTestData.py",
       'behavior' : "EVALUATE",
       'label' : "specializedAtdmTrilinosTestData"
      },
    ]
  }
]
Make sense?
Yeah, it all makes sense to me. Everything looks fine on my end, but when I try to run those scripts now, I get...
**Response**: Failure
**Message**: Encountered an unexpected exception.
StatusScriptResult could not find a valid file that matches this path: promotedAtdmTrilinosTestData.py
Grover makes a hard stop if any of the scripts fail to produce their expected outputs. Meanwhile, when I inspect the output of CDashQueryAnalyzeReport, I see...
Getting test history for tests with issue trackers passing or missing: num=74
Getting 30 days of history for MueLu_FixedMatrixPattern-Tpetra_MPI_4 in the build Trilinos-atdm-waterman-cuda-9.2-debug on waterman from cache file
Traceback (most recent call last):
File "/Users/rmilewi/SEMS/grover/TrilinosATDMStatus//TriBITS/tribits/ci_support/cdash_analyze_and_report.py", line 882, in <module>
requireMatchTestTopTestHistory=inOptions.requireTestHistoryMatchNonpassingTests,
File "/Users/rmilewi/SEMS/grover/TrilinosATDMStatus/TriBITS/tribits/ci_support/CDashQueryAnalyzeReport.py", line 226, in foreachTransform
list_inout[i] = transformFunctor(list_inout[i])
File "/Users/rmilewi/SEMS/grover/TrilinosATDMStatus/TriBITS/tribits/ci_support/CDashQueryAnalyzeReport.py", line 1528, in __call__
testHistoryLOD, self.__date, self.__testingDayStartTimeUtc, daysOfHistory)
File "/Users/rmilewi/SEMS/grover/TrilinosATDMStatus/TriBITS/tribits/ci_support/CDashQueryAnalyzeReport.py", line 1210, in sortTestHistoryGetStatistics
sortedTestHistoryLOD = getUniqueSortedTestsHistoryLOD(sortedTestHistoryLOD)
File "/Users/rmilewi/SEMS/grover/TrilinosATDMStatus/TriBITS/tribits/ci_support/CDashQueryAnalyzeReport.py", line 1309, in getUniqueSortedTestsHistoryLOD
if not checkCDashTestDictsAreSame(candidateTestDict, "a", lastUniqueTestDict, "b")[0]:
File "/Users/rmilewi/SEMS/grover/TrilinosATDMStatus/TriBITS/tribits/ci_support/CDashQueryAnalyzeReport.py", line 1357, in checkCDashTestDictsAreSame
extractTestIdAndBuildIdFromTestDetailsLink(testDict_1['testDetailsLink'])
File "/Users/rmilewi/SEMS/grover/TrilinosATDMStatus/TriBITS/tribits/ci_support/CDashQueryAnalyzeReport.py", line 1322, in extractTestIdAndBuildIdFromTestDetailsLink
phpArgsList = testDetailsLinkList[1].split('&')
IndexError: list index out of range
Error, could not compute the analysis due to above error so return failed!
And this is what's getting called by your script:
cdash_analyze_and_report.py \
--date='2020-05-28' \
--cdash-project-testing-day-start-time='04:01' \
--cdash-project-name='Trilinos' \
--build-set-name='Promoted ATDM Trilinos Builds' \
--cdash-site-url='https://testing-dev.sandia.gov/cdash' \
--cdash-builds-filters='filtercount=2&showfilters=1&filtercombine=and&field1=groupname&compare1=61&value1=ATDM&field2=buildname&compare2=65&value2=Trilinos-atdm-' \
--cdash-nonpassed-tests-filters='filtercount=6&showfilters=1&filtercombine=and&field1=groupname&compare1=61&value1=ATDM&field2=buildname&compare2=65&value2=Trilinos-atdm-&field3=status&compare3=62&value3=passed&field4=testoutput&compare4=94&value4=Error%20initializing%20RM%20connection.%20Exiting&field5=testoutput&compare5=94&value5=OPAL%20ERROR%3A%20Unreachable&field6=testoutput&compare6=96&value6=srun%3A%20error%3A%20s_p_parse_file%3A%20unable%20to%20read%20.%2Fetc%2Fslurm%2Fslurm.conf.%3A%20Permission%20denied' \
--expected-builds-file='/Users/rmilewi/SEMS/grover/TrilinosATDMStatus//promotedAtdmTrilinosExpectedBuilds.csv' \
--tests-with-issue-trackers-file='/Users/rmilewi/SEMS/grover/TrilinosATDMStatus//promotedAtdmTrilinosTestsWithIssueTrackers.csv' \
--cdash-queries-cache-dir='/Users/rmilewi/SEMS/grover' \
--cdash-base-cache-files-prefix='promotedAtdmTrilinosBuilds_' \
--use-cached-cdash-data='off' \
--limit-test-history-days='30' \
--limit-table-rows='200' \
--require-test-history-match-nonpassing-tests='off' \
--print-details='off' \
--write-failing-tests-without-issue-trackers-to-file='promotedAtdmTrilinosTwoif.csv' \
--write-test-data-to-file='promotedAtdmTrilinosTestData.py' \
--write-email-to-file='promotedAtdmTrilinosBuilds.html' \
--email-from-address='' \
--send-email-to='' \
Do you know why this might be happening? I tried running the scripts without Grover, and I get the same error. It generates the HTML file (promotedAtdmTrilinosBuilds.html) and the JSON files (promotedAtdmTrilinosBuilds_fullCDashIndexBuilds.json and promotedAtdmTrilinosBuilds_fullCDashNonpassingTests.json) but stops short of generating the expected Python file (promotedAtdmTrilinosTestData.py).
@rmmilewi, pull updated versions of TriBITS and TrilinosATDMStatus repos. The next upgrade of CDash being evaluated on testing-dev.sandia.gov/cdash/ had a non-backward compatible change to the json data. See https://github.com/TriBITSPub/TriBITS/commit/5b3345e213e0e1ca8f023dd801beda8afef0b1c6 for the fix (which works with old and new CDash).
Yeah, that fixed it. Thanks!
Okay, so, below you'll find an example of what Grover currently produces, in this case for issue #7089, which has 14 test result entries associated (listed either under the promoted ATDM builds or the specialized ones). To publish these comments, Grover will need an account of his own that is part of the Trilinos organization and a personal access token that gives permission to post to issues; I've been testing the capability under my own name and with my own token on a dummy repository.
This was computed about 10 minutes ago, so let me know if what you're seeing matches what you would expect to see (no missing tests, all the columns you wanted, etc.). A few observations...
Hello! This is an automated comment generated by Grover. Each week, I collate and report data from CDash in an automated way to make it easier for developers to stay on top of their issues. I saw that there are tests being tracked on CDash that are associated with this open issue, and I have compiled the status information on each for you.
If you would like me to stop posting comments to this thread, comment "grover shoo" (case-insensitive). If you have a question, please reach out to Ross. I'm just a cat.
@rmmilewi, some feedback ...
A) I think we want a separate table for each category of tests and a header at the top like:
Like you see in the example above. I think some simple refactoring and movement of code in the `CDashQueryAnalyzeReport.py` and `cdash_analyze_and_report.py` modules will make this easy.
Where is the Grover source code so that I can look at this?
B) I think we want to move the Grover intro paragraph to the bottom of the comment (and might even compress it in a block like:
C) I think we should not list out how to make the comments stop coming. That is, we should remove the paragraph:
> If you would like me to stop posting comments to this thread, comment "grover shoo" (case-insensitive). If you have a question, please reach out to Ross. I'm just a cat.
The way that developers make these comments go away is to fix their tests (or disable them). Otherwise, we may need to disable the testing of their package in ATDM Trilinos testing if they don't want to see these reminders.
@bartlettroscoe
> I think we want a separate table for each category of tests and a header at the top like [...]
That's easy enough to do on Grover's end. I can tally up that information easily, unless you want to do that via CDashQueryAnalyzeReport and I simply call a function of yours. Either way works for me.
> compress it in a block
I had no idea we could hide comment text like that, that's perfect. I like this solution.
> I think we should not list out how to make the comments stop coming. [...] The way that developers make these comments go away is to fix their tests (or disable them). Otherwise, we may need to disable the testing of their package in ATDM Trilinos testing if they don't want to see these reminders.
That's a very good point. We shouldn't make it easy to simply ignore these test results. I'll remove that line.
@bartlettroscoe As for those tallies, I should point out that all the CDash entries get converted to an internal format Grover uses which can seamlessly be converted back to dictionaries of the form that CDashQueryAnalyzeReport expects. This makes it easier for me to encapsulate data sanitization steps and it ensures that ill-formed/incomplete/unrecognized data can never cause Grover to fail outright.
One consequence of this is that it allows me to have typed interactions with the CDash data; it's straightforward for me to perform queries over all the CDash result objects. If you want additional information computed based on the results, I can handle that on my end. That is, if you'd like.
> I can tally up that information easily, unless you want to do that via CDashQueryAnalyzeReport and I simply call a function of yours. Either way works for me.
@rmmilewi, let me take a look at the code in `CDashQueryAnalyzeReport.py` and `cdash_analyze_and_report.py` and I will get back to you. I suspect we can factor out some simple functions that Grover can just call, and we can exclude any troublesome data like the `<style>...</style>` block that GitHub Markdown is not handling well.
The more reusable code that lives in TriBITS that we can leverage, the better.
> One consequence of this is that it allows me to have typed interactions with the CDash data; it's straightforward for me to perform queries over all the CDash result objects.
Just so we are clear, I don't think Grover should directly query CDash. It should just operate on the data provided as output from (indirect) `cdash_analyze_and_report.py` calls.
> The more reusable code that lives in TriBITS that we can leverage, the better.
That's a good point. There are definitely opportunities to refactor that code in TriBITS to make it more reusable.
> Just so we are clear, I don't think Grover should directly query CDash. It should just operate on the data provided as output from (indirect) cdash_analyze_and_report.py calls.
Oh, of course, I agree. Best to keep those separate. I have no plans to interact with CDash in any way, just indirectly through cdash_analyze_and_report.
I just meant that when you pass me data, it's captured in a way that I can write checks/queries against the data that are guaranteed to succeed. Like the status flag is encoded as an enum, and I ensure that even if the status is missing, is garbage, is all in upper case, is of a novel type, etc., there will always be an intelligible unit of data for me to act upon (even if it's just `StatusFlag.UNKNOWN`). Or like how the issue number needs to be a string in some contexts (`"#5732"`) and an integer (`5732`) when passed to PyGithub; I don't have to think about that in the code.
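An illustrative sketch of that normalization idea (these class and function names are hypothetical, not Grover's actual code; `enum` is stdlib in Python 3 and available for 2.7 via the enum34 backport):

```python
from enum import Enum

class StatusFlag(Enum):
    PASSED = "passed"
    FAILED = "failed"
    NOT_RUN = "not run"
    MISSING = "missing"
    UNKNOWN = "unknown"

def parseStatusFlag(rawStatus):
    # Tolerate missing/garbage/odd-case input; never fail outright.
    try:
        return StatusFlag(str(rawStatus).strip().lower())
    except ValueError:
        return StatusFlag.UNKNOWN

def issueNumberAsInt(issueStr):
    # "#5732" as displayed on GitHub -> 5732 as PyGithub expects.
    return int(str(issueStr).lstrip('#'))
```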
> I should point out that all the CDash entries get converted to an internal format Grover uses which can seamlessly be converted back to dictionaries of the form that CDashQueryAnalyzeReport expects.
@rmmilewi, any chance I can see the Grover code in its current state? That will provide a guide for any refactoring of `CDashQueryAnalyzeReport.py` and `cdash_analyze_and_report.py`.
@bartlettroscoe Oh, of course! I'll see about granting you access here in just a moment.
Okay, so I just granted you access to the repository on gitlab-ex: https://gitlab-ex.sandia.gov/rmilewi/grover
In grover/core/interactor.py, you'll find `TriBITSCommentFormatter` on line 287. That's where I interact with CDashAnalyzeAndReport to generate the table. The formatter is used by `UpdateIssueTrackerInteractor`, which you can find directly below on line 317.
Let me know if you have any questions!
> In grover/core/interactor.py, you'll find TriBITSCommentFormatter on line 287. That's where I interact with CDashAnalyzeAndReport to generate the table. The formatter is used by UpdateIssueTrackerInteractor, which you can find directly below on line 317.
@rmmilewi, okay, it will be straightforward to replace that code with calls to TriBITS code. I will provide a single function that will do everything.
And note that you will not be providing `daysOfHistory`. That is actually embedded in the data (and if it is not, I will add it, because that is not something you can change arbitrarily; it was determined at the time the data was pulled down off CDash).
@bartlettroscoe
> @rmmilewi, okay, it will be straightforward to replace that code with calls to TriBITS code. I will provide a single function that will do everything.
Sounds good to me!
> And note that you will not be providing daysOfHistory. That is actually embedded in the data (and if it is not, I will add it, because that is not something you can change arbitrarily; it was determined at the time the data was pulled down off CDash).
Right, I know. I just wasn't sure from where to pull that information in the event that it changes, because the default value is hidden away in the script (`--limit-test-history-days='30'`), and I didn't want to hardcode that information anywhere in Grover. It might be in the CDash data I get as input though, I should go back and check.
Just about ready to merge TriBITS code that generates the following type of HTML for a comment.
Tests with issue trackers Failed: twif=8
Tests with issue trackers Not Run: twinr=1
Site | Build Name | Test Name | Status | Details | Consecutive Non-pass Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
cee-rhel6 | Trilinos-atdm-cee-rhel6-clang-opt-serial | Teko_ModALPreconditioner_MPI_1 | Not Run | Required Files Missing | 15 | 15 | 0 | #3638 |
CC: @rmmilewi
I just merged the TriBITS PR https://github.com/TriBITSPub/TriBITS/pull/322 to the TriBITS 'master' branch that created a new class IssueTrackerTestsStatusReporter. I plugged this into the 'grover' code in the MR:
I manually posted what the content would look like as posted by 'grover' to a few Trilinos GitHub Issues:
There are a few issues that need to be fixed in 'grover' before we can deploy the first version. This is outlined in:
The only one that needs to get fixed before we can deploy an initial version of this tool is to refactor the code to not copy the test dicts into a different data-structure and then copy them back. The data was being copied back incorrectly in some cases, and it does not support the addition of new fields. Once that is fixed, I think we can deploy the first version that just updates the open issues once a week.
CC: @rmmilewi
A problem with this approach of just embedding the tables in the GitHub Issue comments as you can see in:
is that the test summary tables are hard to read. First, GitHub Issue comments are very narrow, so you can't see all of the columns of the table, and it makes for very tall rows. Second, the colors are not shown, which makes the tables harder to read.
Therefore, it would be good to find an easy way for developers to view these tables in their full HTML splendor. One approach might be to attach the HTML as a file to the GitHub Issue comment. I showed how this might look in:
However, you can't directly attach an `*.html` file and instead have to attach an `*.html.txt` file. Then the developer needs to download that file, change the extension from `*.html.txt` to `*.html`, and then open that file in a browser. So this is not very nice, and I don't think many developers will bother to do that.
Another approach would be to post the HTML to a website and then view it from there. We could do that with a GitHub Pages site maintained with a GitHub repo, for example. Or, we could just post it to the trilinos.org site. Just to show what that might look like, I manually posted:
and put in a link to this from:
That looks much better and is much more readable. I think we need to do something like this to make this information more readable and useful for developers.
Below I log some details of how I tested this with 'grover' and the subtasks that I did in the last week.
FYI: Met with @rmmilewi yesterday to go over the current status of Grover towards the minimum viable product. Here are the todos:
- Exclude the `<html>` block.
- Add an option `--dummyIssueId=<issueid>` to submit the status of all of the tracked tests as new comments to the dummy Trilinos GitHub issue. (That will both validate that the GitHub authentication is working correctly and it will show, again, what the comments look like.)

@bartlettroscoe @rmmilewi Thanks for setting this up -- it looks to be really useful. Just wanted to mention a small typo: the Grover message is missing the word "are" just after "there":
> Grover saw that there tests being tracked on CDash that are associated with this open issue.
> Just wanted to mention a small typo: the Grover message is missing the word "are" just after "there"
@jhux2, thanks for the catch! Should be fixed going forward.
We have now finally deployed the first "Minimum Viable Product" that adds a single comment per open ATDM Trilinos GitHub issue as described in ATDV-365. The Jenkins project:
is set up to run once a week at 7 AM on Mondays to post these comments. I sent out the following email to the Trilinos developers yesterday, and then I manually ran the above Jenkins project, which resulted in 11 comments being posted; the log output showed:
16:47:35 [main.py] issue #6126, success=False, message:The issue (#6126) is not open. Right now, Grover avoids publishing results to closed issues.
16:47:35 [main.py] issue #6246, success=False, message:The issue (#6246) is not open. Right now, Grover avoids publishing results to closed issues.
16:47:35 [main.py] issue #3862, success=False, message:The issue (#3862) is not open. Right now, Grover avoids publishing results to closed issues.
16:47:35 [main.py] issue #6882, success=False, message:The issue (#6882) is not open. Right now, Grover avoids publishing results to closed issues.
16:47:35 [main.py] issue #6540, success=True, message:None
16:47:35 [main.py] issue #6799, success=True, message:None
16:47:35 [main.py] issue #7778, success=True, message:None
16:47:35 [main.py] issue #3863, success=True, message:None
16:47:35 [main.py] issue #6790, success=True, message:None
16:47:35 [main.py] issue #6991, success=False, message:The issue (#6991) is not open. Right now, Grover avoids publishing results to closed issues.
16:47:35 [main.py] issue #7690, success=False, message:The issue (#7690) is not open. Right now, Grover avoids publishing results to closed issues.
16:47:35 [main.py] issue #6009, success=True, message:None
16:47:35 [main.py] issue #6216, success=True, message:None
16:47:35 [main.py] issue #6553, success=True, message:None
16:47:35 [main.py] issue #6455, success=True, message:None
16:47:35 [main.py] issue #5006, success=True, message:None
16:47:35 [main.py] issue #7089, success=True, message:None
and posted the comments:
The next step is to update the comments so they give suggestions on whether the issue can be closed because the issue has been addressed.
Related to:
This issue has had no activity for 365 days and is marked for closure. It will be closed after an additional 30 days of inactivity.
If you would like to keep this issue open, please add a comment and/or remove the `MARKED_FOR_CLOSURE` label.
If this issue should be kept open even with no activity beyond the time limits, you can add the label `DO_NOT_AUTOCLOSE`.
If it is ok for this issue to be closed, feel free to go ahead and close it. Please do not add any comments or change any labels or otherwise touch this issue unless your intention is to reset the inactivity counter for an additional year.
CC: @fryeguy52, @trilinos/framework
Description
As part of the work to implement a Python tool to pull down, analyze, and summarize CDash build and test data in #2933, we realized that we had all of the information that would be needed to update Trilinos GitHub issues about the status of tests associated with those issues. The idea would be to add a GitHub issue comment containing tables showing the current status of the tests related to a GitHub Issue. An example of what this could look like is shown in https://github.com/trilinos/Trilinos/issues/3579#issuecomment-438324283 and https://github.com/trilinos/Trilinos/issues/3833#issuecomment-438317812 where I just manually copied and pasted the HTML-formatted tables and rows for those issues right into the GitHub comments.
For #3579 on 11/14/2018, the results for 11/13/2018 might look like:
Test results for #3579 as of testing day 2018-11-13
Tests with issue trackers Passed: twip=2
Tests with issue trackers Missing: twim=3
NOTE: With all of the tests associated with this issue passing or missing (i.e. disabled), might this Issue be addressed and perhaps be closed?
Detailed test results: (click to expand)
Tests with issue trackers Passed: twip=2
Tests with issue trackers Missing: twim=3
So the idea is that once that comment was added, the Trilinos developer responsible for the GitHub issue could add a comment stating that this was not a randomly failing test, so having this test pass or be disabled indicated that the issue could be resolved, and then close the issue. No need to look at CDash directly, add in new CDash links, etc. Just comment and close. That would save a lot of human time.
So we would add these comments in the following cases:
- Once a week, just as a reminder of the current status of the tests related to this issue (so that Trilinos developers would not forget about the issue).
- When one of the associated tests changed status (e.g., went from passing to failing or failing to passing). (But not for frequent randomly failing tests, or that would create a lot of spam updates.)
- When all of the associated tests were passing or missing for X (e.g. 2) consecutive days, like shown above. (But not for any randomly failing tests, or that would create a lot of spam.)
- When all of the associated tests are passing for the full acquired test history (e.g. 30 days) or are missing for frequent randomly failing tests. (But not for rare randomly failing tests.)
This should make it so that people don't need to manually check on the status of the associated tests for a GitHub issue. They could just let an automated system (that will be created in this Story) update the GitHub issue when something worth noting has occurred and when the issue might be closed.
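As a hedged sketch of the trigger logic described by the cases above (all names and thresholds here are illustrative, not the tool's actual design):

```python
def shouldPostComment(statusHistory, daysSinceLastComment,
                      isRandomlyFailing, consecutiveDaysToSuggestClose=2):
    """statusHistory: most-recent-first list of daily test statuses,
    e.g. ['Passed', 'Passed', 'Failed', ...]."""
    # Case 1: weekly reminder, regardless of status.
    if daysSinceLastComment >= 7:
        return True
    # Avoid spam for randomly failing tests in the remaining cases.
    if isRandomlyFailing:
        return False
    # Case 2: a test changed status since yesterday.
    if len(statusHistory) >= 2 and statusHistory[0] != statusHistory[1]:
        return True
    # Case 3: all passing or missing for X consecutive days.
    recent = statusHistory[:consecutiveDaysToSuggestClose]
    if recent and all(s in ('Passed', 'Missing') for s in recent):
        return True
    return False
```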
Also, this tool could check that if all of the tests are passing and the GitHub issue was closed, then it could automatically remove the entries for that issue from a list of `*.csv` files. And, if it detects that tests are not all passing and the GitHub issue is closed, then it could automatically re-open the issue and provide a summary of the test results.
Related
Tasks
- Update the `cdash_analyze_and_report.py` tool to write detailed test information and history for all tests with issue trackers being monitored by the tool. [Done]
- Create a tool `update_github_issues_for_tests_status.py` that will update GitHub issues as described above given the output from the `cdash_analyze_and_report.py` tool described above ... See grover_update_trilinos_github_issues_with_test_status.sh [Done]
- Update `cdash_analyze_and_report.py` to provide the `fail_frequency` field and put it in the output data-structure for each issue.
- Update `CDashQueryAnalyzeReport.py` to provide pass/fail criteria for each issue and provide suggestions for when to close an issue based on if tests passing for X days matching `fail_frequency` criteria. (needs more analysis)
- Automatically remove entries from the `*.csv` file for Issues that are closed and have met the passing criteria based on `fail_frequency` logic (see above).