@bartlettroscoe opened this issue 6 years ago
@fryeguy52, as we discussed yesterday, this would be a fun thing to start working on, and it would have a large impact in making sure that people follow up on ATDM Trilinos GitHub issues and would require less of our own personal time to do so.
@fryeguy52, other than just PyGithub, some other things to consider include:
@fryeguy52,
FYI: based on our conversation today, I just added the scope:
Also, this tool could check that if all of the tests are passing and the GitHub issue was closed, then it could automatically remove the entries for that issue from a list of `*.csv` files. And, if it detects that tests are not all passing and the GitHub issue is closed, then it could automatically re-open the issue and provide a summary of the test results. That would make it so that we would never need to manually re-open GitHub issues that should not have been closed, and it would make it so that we would not have to manually remove entries from the `*.csv` files for closed issues.
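For concreteness, a minimal sketch of that close/re-open logic using PyGithub (the helpers `summarizeTestResults` and `removeIssueEntriesFromCsvFiles` are hypothetical stand-ins, not existing code):

```python
def summarizeTestResults(testDicts):
    # Hypothetical: build a short summary of the non-passing test dicts.
    return "Non-passing tests: %d" % sum(
        1 for t in testDicts if t['status'] != 'Passed')

def removeIssueEntriesFromCsvFiles(issueNumber):
    # Hypothetical: drop this issue's rows from the tracked *.csv files.
    pass

def reconcileClosedIssue(issue, testDicts):
    # 'issue' is assumed to be a PyGithub Issue object.
    allPassing = all(t['status'] == 'Passed' for t in testDicts)
    if issue.state == 'closed':
        if allPassing:
            removeIssueEntriesFromCsvFiles(issue.number)
        else:
            issue.create_comment(summarizeTestResults(testDicts))
            issue.edit(state='open')  # re-open the wrongly closed issue
```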
FYI: Joe Frye mentioned that Aaron L. has some code that pulls down GitHub Issue data and sends out summary emails. We need to ask Aaron to look at his scripts once we can get down to this.
CC: @rmmilewi
@rmmilewi, take a look at:
to understand the entire ATDM Trilinos build process and where this fits in and especially:
@rmmilewi, as discussed at our meeting just now, I will add the option `--write-test-data-to-file=<file>` to the `cdash_analyze_and_report.py` tool and I will add a README.md file for how to run the tool on the ATDM Trilinos builds to generate the data your tool/bot will work with.
@rmmilewi, I added the `cdash_analyze_and_report.py` option `--write-test-data-to-file=<file>` and updated the TrilinosATDMStatus scripts to call that option. If you follow the instructions at:
you can get set up locally to run the driver script as:
$ ../TrilinosATDMStatus/trilinos_atdm_builds_status_driver.sh --date=yesterday \
--email-from-address= --send-email-to=${USER}@sandia.gov
That will produce the files:
$ ls -w 1 *TestData.py
promotedAtdmTrilinosTestData.py
specializedAtdmTrilinosCleanupTestData.py
You read these in Python as:
>>> testData = []
>>> testData.extend(eval(open('promotedAtdmTrilinosTestData.py', 'r').read()))
>>> testData.extend(eval(open('specializedAtdmTrilinosCleanupTestData.py', 'r').read()))
>>> print(len(testData))
167
You want to group these by the 'issue_tracker_url' field of the test dicts and then create an HTML string table using the function `createCDashTestHtmlTableStr()` shown at:
or perhaps the function `createHtmlTableStr()` shown at:
Let me know if you have questions. Otherwise, I can mock up how this might look. It is not much code to create the markdown and HTML that would go into the GitHub Issue comment.
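As a minimal sketch of that grouping step (assuming `testData` is the list of test dicts read as shown above, each carrying an 'issue_tracker_url' key):

```python
from collections import defaultdict

# 'testData' as read above from the *TestData.py files.
testData = []
testData.extend(eval(open('promotedAtdmTrilinosTestData.py', 'r').read()))
testData.extend(eval(open('specializedAtdmTrilinosCleanupTestData.py', 'r').read()))

# Group the test dicts by their 'issue_tracker_url' field.
testsByIssueUrl = defaultdict(list)
for testDict in testData:
    testsByIssueUrl[testDict['issue_tracker_url']].append(testDict)

# One HTML table (and one GitHub comment) would then be generated per issue.
for issueUrl, issueTestDicts in testsByIssueUrl.items():
    print(issueUrl, len(issueTestDicts))
```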
@rmmilewi, I think I want to change these file formats over to proper JSON. When I do that at some point, I will let you know. It should be pretty easy to adjust.
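Once that switch happens, the `eval()` calls above would presumably become plain JSON reads (a sketch; the `.json` file names are an assumption):

```python
import json

# Hypothetical file names after the switch from Python-dict files to JSON:
testData = []
with open('promotedAtdmTrilinosTestData.json', 'r') as f:
    testData.extend(json.load(f))
with open('specializedAtdmTrilinosCleanupTestData.json', 'r') as f:
    testData.extend(json.load(f))
```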
@bartlettroscoe So, I'm sorting out the design and assessing the requirements based on the notes I took, the audio recording of our session, and what's described here on this issue. Could you provide me with an example copy of the data that you want the tool to read in? One that you generated, looked over, and confirmed was valid and as-expected? I know I can fetch this data myself, but I want another human in the loop on this.
> I know I can fetch this data myself, but I want another human in the loop on this.
@rmmilewi, I will be in the loop as much as needed, but it should be trivial to generate the data yourself following the instructions at:
However, here is the file promotedAtdmTrilinosTestData.py generated from the command:
$ ../TrilinosATDMStatus/trilinos_promoted_atdm_builds_status.sh --date=2020-03-06
That is only partial data from today, but it gives tests in all of the different categories 'twoif', 'twip', 'twim' and 'twif' in the associated summary HTML email text promotedAtdmTrilinosBuilds.html.
Please make sure that you can run that driver on your local machine and generate this same data as well. Again, it should be trivial if you have Python 2.7+.
Just download those files, remove the `.txt` extension, and you should be able to use them right away.
Have a look at that data and let me know if you have any questions about it. See the basic instructions above.
@bartlettroscoe It's trivial, I know, and I'll be generating plenty of data myself, but it's an important step in the process of stakeholder engagement for me that I have some agreed-upon, canonical data that captures the phenomena of interest. Same goes for any other data you think I may need (like that CSV file you showed me). In this case, either I can generate it myself and send it to you, or we can use the data you just generated, so long as I get a confirmation from you.
Speaking of requirements, does `cdash_analyze_and_report.py` require Python 2? I just wanted to make sure.
> Same goes for any other data you think I may need (like that CSV file you showed me).
@rmmilewi, with my current thinking, the tool and process being written in this issue should never need to look at those CSV files. All of the data needed should be present in the Python dicts for each of these tests. If there is missing data, then we need to add it.
> In this case, either I can generate it myself and send it to you, or we can use the data you just generated, so long as I get a confirmation from you.
The best way to do this is to ask "Hey Ross, what testing day looks to be a good day for running my tool?". That is just a string YYYY-MM-DD passed in the `--date` argument.
> Speaking of requirements, does cdash_analyze_and_report.py require Python 2? I just wanted to make sure.
It is only currently being tested with Python 2. Making sure it works with Python 3 is not a lot of work. There is an overlap of Python 2 and 3 that allows a Python program to work with both. (I need to figure out how to add Python 3 testing to the Travis CI testing for TriBITS.)
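For reference, the usual 2-and-3 overlap idioms look something like this (a generic sketch, not code from TriBITS itself):

```python
# Write to the common subset of Python 2.7 and 3.x:
from __future__ import print_function, division

try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO        # Python 3

print("runs under both Python 2.7 and 3.x")
```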
> With my current thinking, the tool and process being written in this issue should never need to look at those CSV files. All of the data needed should be present in the Python dicts for each of these tests. If there is missing data, then we need to add it.
@bartlettroscoe Ah, gotcha! Good to know!
> The best way to do this is to ask "Hey Ross, what testing day looks to be a good day for running my tool?". That is just a string YYYY-MM-DD passed in the --date argument.
Not a bad idea. It might be worthwhile to keep me in the loop for any noteworthy test/bug-related events. It would be interesting to build up a dataset of that sort.
> It is only currently being tested with Python 2. Making sure it works with Python 3 is not a lot of work. There is an overlap of Python 2 and 3 that allows a Python program to work with both. (I need to figure out how to add Python 3 testing to the Travis CI testing for TriBITS.)
Cool! From the looks of it, I can interact with your tools through your scripts, so I'm not concerned about any compatibility issues between your code and mine. For my part in this I'm leaning towards making sure my code is Python-3-compatible. It's possible that elements of this codebase could be reused in the future, and seeing as Python 2 has reached end of life, I expect Python 3 to be the norm for future development. Miranda and I reached the same conclusion about the Traffic code we've been developing. Of course, I'm open to any ideas or preferences you might have.
> I can interact with your tools through your scripts, so I'm not concerned about any compatibility issues between your code and mine.
@rmmilewi, not entirely. The code that generates the HTML tables needs to be reused. Please don't write that stuff again. That will just make more code to maintain.
> It's possible that elements of this codebase could be reused in the future, and seeing as Python 2 has reached end of life, I expect Python 3 to be the norm for future development.
The reality is that as long as Python 2.7 is the default Python on SNL CEE RHEL7 systems, our Python tools will need to support Python 2.7 for many years to come.
> The code that generates the HTML tables needs to be reused. Please don't write that stuff again. That will just make more code to maintain.
@bartlettroscoe Oh, of course. I don't plan on reinventing anything that we already have, don't worry.
> The reality is that as long as Python 2.7 is the default Python on SNL CEE RHEL7 systems, our Python tools will need to support Python 2.7 for many years to come.
Fair point. I'll be testing the code against 2.7, but I'll also be testing against 3.x because I should be able to write this code in a 2-or-3 compatible way anyway. If that fails, I will at least ensure that everything can run smoothly on 2.7.
> I'll be testing the code against 2.7, but I'll also be testing against 3.x because I should be able to write this code in a 2-or-3 compatible way anyway. If that fails, I will at least ensure that everything can run smoothly on 2.7.
See the advice in:
The full TriBITS test suite has been previously ported and tested with Python 2.6, 2.7 and some version of 3.x. Just need to add a Python 3 build to the TriBITS Travis CI testing to make sure this is maintained.
@rmmilewi, FYI: As explained in https://github.com/TriBITSPub/TriBITS/issues/305 you are going to see some tests with the test dict 'status' field value `Missing / Failed` (in addition to `Passed`, `Not Run`, `Failed`, and `Missing`). I am not sure how this special test status should be handled in the bot being designed and written in this Issue. For the purposes of informing Trilinos developers, you don't want to report this as "Missing" or "Failed".
@rmmilewi, let's try to have as much of the conversation about this clearly UUR topic in this GitHub issue and not in emails. (Emails don't provide any long-term traceability. We want design discussion archived for all time.)
From: Bartlett, Roscoe A
Sent: Wednesday, May 27, 2020 9:40 AM
To: Milewicz, Reed rmilewi@sandia.gov
Subject: RE: Regarding demo of Grover
Yes, we know GitHub Markdown supports the HTML tables as produced by the Python code used in cdash_analyze_and_report.py. That is how I produced the tables shown in:
https://github.com/trilinos/Trilinos/issues/3887#issue-381335600
For now, letβs just let them render the way they are. The tables shown in:
https://github.com/trilinos/Trilinos/issues/3887#issue-381335600
look okay to start with.
We can worry about better rendering in a later iteration.
-Ross
Dr. Roscoe A. Bartlett, PhD
https://bartlettroscoe.github.io/
Sandia National Laboratories
From: Milewicz, Reed rmilewi@sandia.gov
Sent: Wednesday, May 27, 2020 9:36 AM
To: Bartlett, Roscoe A rabartl@sandia.gov
Subject: Re: Regarding demo of Grover
Ah, sorry, meant to get back to you yesterday. I should have an answer today, hopefully! In single-responsibility-principle fashion, I divided up comment-posting functionality into an IssueTrackerPublisher class and an IssueTrackerCommentFormatter class, and there's a stub for TriBITSCommentFormatter that I was saving for this week.
Previously, I did some testing to make sure that GitHub markdown supports raw HTML tables (it does), and if it turns out that they're not rendering the way that we want, we could either modify the code in TriBITS, or do some find/replace magic on the resulting string. In any case, I should have an answer soon.
@bartlettroscoe Hey there,
Question for you, just so I'm clear on the requirements for Grover.
Right now, I can pull in any data that gets generated from any of the scripts you run, can check against a CSV file, etc. For example, if I just wanted to poll the status of promoted ATDM builds, I'd specify that script in the config file:
{
  'scriptPath' : "trilinos_promoted_atdm_builds_status.sh",
  'arguments' : ["--date=today"],
  'results' : [
    {'resultPath' : "promotedAtdmTrilinosTestData.py",
     'behavior' : "EVALUATE",
     'label' : "promotedAtdmTrilinosTestData"
    },
  ]
}
When I call `grover updateissues`, Grover can call the scripts, parse the CDash entries, organize them by issue tracker, and then kick off a set of requests to post updates to those threads (assuming they exist, are open, and are unlocked):
[debug] Request created to publish digest to issue #5892 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #6790 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #5002 (2 entries associated with this issue).
[debug] Request created to publish digest to issue #6796 (8 entries associated with this issue).
[debug] Request created to publish digest to issue #5006 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #6799 (8 entries associated with this issue).
[debug] Request created to publish digest to issue #6801 (3 entries associated with this issue).
[debug] Request created to publish digest to issue #6804 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #6805 (4 entries associated with this issue).
[debug] Request created to publish digest to issue #6553 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #6051 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #5545 (2 entries associated with this issue).
[debug] Request created to publish digest to issue #7089 (7 entries associated with this issue).
[debug] Request created to publish digest to issue #7090 (3 entries associated with this issue).
[debug] Request created to publish digest to issue #6070 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #6455 (6 entries associated with this issue).
[debug] Request created to publish digest to issue #6329 (6 entries associated with this issue).
[debug] Request created to publish digest to issue #6333 (1 entries associated with this issue).
[debug] Request created to publish digest to issue #5310 (2 entries associated with this issue).
[debug] Request created to publish digest to issue #6216 (2 entries associated with this issue).
[debug] Request created to publish digest to issue #6991 (4 entries associated with this issue).
[debug] Request created to publish digest to issue #6882 (36 entries associated with this issue).
[debug] Request created to publish digest to issue #6246 (2 entries associated with this issue).
I tested out the table-generating code in CDashAnalyzeAndReport on GitHub by routing the output to a demo repository I created, and the tables look good. Grover can report the passing, missing, and failing tests associated with the issue. Everything runs on 2.7+ and 3.7+, and I have numerous unit tests covering it all (well, almost, just need to add a few more). Anyway, question for you, assuming everything I just said sounds reasonable (if not, then I will work to make it reasonable by next Tuesday, haha). Which builds are you wanting to report on? All the ATDM builds? Just the builds listed in the CSV file of your choice? I just wanted to confirm exactly what you were wanting in this iteration. Thanks!
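A minimal sketch of what that per-issue publishing step might look like with PyGithub (`entriesByIssue` and `formatDigestComment` below are illustrative stand-ins, not Grover's actual API):

```python
from github import Github

gh = Github("<personal-access-token>")      # token is a placeholder
repo = gh.get_repo("trilinos/Trilinos")

# Example input: test-dict entries already grouped by issue number.
entriesByIssue = {7089: [{'testName': 'SomeTest', 'status': 'Failed'}]}

def formatDigestComment(entries):
    # Hypothetical formatter; the real one builds the HTML tables.
    return "Tracked tests for this issue: %d entries" % len(entries)

for issueNumber, entries in entriesByIssue.items():
    issue = repo.get_issue(number=issueNumber)
    # Only publish to issues that exist, are open, and are unlocked.
    if issue.state == "open" and not issue.locked:
        issue.create_comment(formatDigestComment(entries))
```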
@rmmilewi, this sounds really great!
Can you post a sample in a comment below in this issue of what one of the more interesting generated comments will look like? (Or even post a few examples, covering passing, missing, and failing tests.)
> Anyway, question for you, assuming everything I just said sounds reasonable (if not, then I will work to make it reasonable by next Tuesday, haha). Which builds are you wanting to report on? All the ATDM builds? Just the builds listed in the CSV file of your choice? I just wanted to confirm exactly what you were wanting in this iteration.
That is a great question. If possible, can you please combine the test results for the two scripts:
in order to create the master list? We have GitHub issues that relate to builds going to the (promoted) 'ATDM' CDash group and the 'Specialized' CDash group, and some GitHub issues involve tests from both of these sets.
Does that make sense?
@rmmilewi, one issue that I just realized is that we need to consider how to handle missing builds (and therefore incomplete test results). Currently, the `cdash_analyze_and_report.py` tool will list out missing builds at the top of the summary email but will not list out the tests associated with those missing builds. The issue is that we don't want to post a comment to a GitHub issue listing only passing tests but not mentioning anything about tests that might be failing but have missing test results because the associated builds were missing. The Trilinos developer might think that everything is good and close the issue, but there might be failing tests in the missing builds posted the next day (or later days).
My initial idea is to add a new option `--report-missing-tests-from-missing-builds=on` to the `cdash_analyze_and_report.py` tool that would result in the tests associated with a given missing build being reported as missing (perhaps with status `Build Missing`?) and therefore have those tests also show up in the output files promotedAtdmTrilinosTestData.py and specializedAtdmTrilinosCleanupTestData.py. Therefore, your Grover tool would not need to change at all (but would have to pass an extra argument in the invocation of the driver scripts).
What do you think about that?
Anyway, we can discuss that idea when we talk.
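For concreteness, a hedged sketch of what that option might do inside the tool (the field names and helper here are assumptions for illustration, not the actual TriBITS code):

```python
# For each expected build that is missing, synthesize a test dict with the
# special 'Build Missing' status so it flows into the *TestData.py output.
def addMissingTestsForMissingBuilds(missingExpectedBuilds,
                                    testsWithIssueTrackers):
    syntheticTestDicts = []
    for build in missingExpectedBuilds:
        for test in testsWithIssueTrackers:
            if test['buildName'] == build['buildName']:
                testDict = dict(test)          # copy the tracked-test info
                testDict['status'] = 'Build Missing'
                syntheticTestDicts.append(testDict)
    return syntheticTestDicts
```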
@rmmilewi,
I should mention that when you run these tools, you will need to turn off the emails getting sent out by using the argument `--send-email-to=`. Also, you need to run this after the last testing day is complete, so you will need to run with `--date=yesterday` instead of `--date=today`. So putting this all together you would have:
[
  {
    'scriptPath' : "trilinos_promoted_atdm_builds_status.sh",
    'arguments' : ["--date=yesterday", "--send-email-to="],
    'results' : [
      {'resultPath' : "promotedAtdmTrilinosTestData.py",
       'behavior' : "EVALUATE",
       'label' : "promotedAtdmTrilinosTestData"
      },
    ]
  },
  {
    'scriptPath' : "trilinos_specialized_atdm_builds_status.sh",
    'arguments' : ["--date=yesterday", "--send-email-to="],
    'results' : [
      {'resultPath' : "specializedAtdmTrilinosCleanupTestData.py",
       'behavior' : "EVALUATE",
       'label' : "specializedAtdmTrilinosTestData"
      },
    ]
  }
]
Make sense?
Yeah, it all makes sense to me. Everything looks fine on my end, but when I try to run those scripts now, I get...
**Response**: Failure
**Message**: Encountered an unexpected exception.
StatusScriptResult could not find a valid file that matches this path: promotedAtdmTrilinosTestData.py
Grover makes a hard stop if any of the scripts fail to produce their expected outputs. Meanwhile, when I inspect the output of CDashQueryAnalyzeReport, I see...
Getting test history for tests with issue trackers passing or missing: num=74
Getting 30 days of history for MueLu_FixedMatrixPattern-Tpetra_MPI_4 in the build Trilinos-atdm-waterman-cuda-9.2-debug on waterman from cache file
Traceback (most recent call last):
File "/Users/rmilewi/SEMS/grover/TrilinosATDMStatus//TriBITS/tribits/ci_support/cdash_analyze_and_report.py", line 882, in <module>
requireMatchTestTopTestHistory=inOptions.requireTestHistoryMatchNonpassingTests,
File "/Users/rmilewi/SEMS/grover/TrilinosATDMStatus/TriBITS/tribits/ci_support/CDashQueryAnalyzeReport.py", line 226, in foreachTransform
list_inout[i] = transformFunctor(list_inout[i])
File "/Users/rmilewi/SEMS/grover/TrilinosATDMStatus/TriBITS/tribits/ci_support/CDashQueryAnalyzeReport.py", line 1528, in __call__
testHistoryLOD, self.__date, self.__testingDayStartTimeUtc, daysOfHistory)
File "/Users/rmilewi/SEMS/grover/TrilinosATDMStatus/TriBITS/tribits/ci_support/CDashQueryAnalyzeReport.py", line 1210, in sortTestHistoryGetStatistics
sortedTestHistoryLOD = getUniqueSortedTestsHistoryLOD(sortedTestHistoryLOD)
File "/Users/rmilewi/SEMS/grover/TrilinosATDMStatus/TriBITS/tribits/ci_support/CDashQueryAnalyzeReport.py", line 1309, in getUniqueSortedTestsHistoryLOD
if not checkCDashTestDictsAreSame(candidateTestDict, "a", lastUniqueTestDict, "b")[0]:
File "/Users/rmilewi/SEMS/grover/TrilinosATDMStatus/TriBITS/tribits/ci_support/CDashQueryAnalyzeReport.py", line 1357, in checkCDashTestDictsAreSame
extractTestIdAndBuildIdFromTestDetailsLink(testDict_1['testDetailsLink'])
File "/Users/rmilewi/SEMS/grover/TrilinosATDMStatus/TriBITS/tribits/ci_support/CDashQueryAnalyzeReport.py", line 1322, in extractTestIdAndBuildIdFromTestDetailsLink
phpArgsList = testDetailsLinkList[1].split('&')
IndexError: list index out of range
Error, could not compute the analysis due to above error so return failed!
And this is what's getting called by your script:
cdash_analyze_and_report.py \
--date='2020-05-28' \
--cdash-project-testing-day-start-time='04:01' \
--cdash-project-name='Trilinos' \
--build-set-name='Promoted ATDM Trilinos Builds' \
--cdash-site-url='https://testing-dev.sandia.gov/cdash' \
--cdash-builds-filters='filtercount=2&showfilters=1&filtercombine=and&field1=groupname&compare1=61&value1=ATDM&field2=buildname&compare2=65&value2=Trilinos-atdm-' \
--cdash-nonpassed-tests-filters='filtercount=6&showfilters=1&filtercombine=and&field1=groupname&compare1=61&value1=ATDM&field2=buildname&compare2=65&value2=Trilinos-atdm-&field3=status&compare3=62&value3=passed&field4=testoutput&compare4=94&value4=Error%20initializing%20RM%20connection.%20Exiting&field5=testoutput&compare5=94&value5=OPAL%20ERROR%3A%20Unreachable&field6=testoutput&compare6=96&value6=srun%3A%20error%3A%20s_p_parse_file%3A%20unable%20to%20read%20.%2Fetc%2Fslurm%2Fslurm.conf.%3A%20Permission%20denied' \
--expected-builds-file='/Users/rmilewi/SEMS/grover/TrilinosATDMStatus//promotedAtdmTrilinosExpectedBuilds.csv' \
--tests-with-issue-trackers-file='/Users/rmilewi/SEMS/grover/TrilinosATDMStatus//promotedAtdmTrilinosTestsWithIssueTrackers.csv' \
--cdash-queries-cache-dir='/Users/rmilewi/SEMS/grover' \
--cdash-base-cache-files-prefix='promotedAtdmTrilinosBuilds_' \
--use-cached-cdash-data='off' \
--limit-test-history-days='30' \
--limit-table-rows='200' \
--require-test-history-match-nonpassing-tests='off' \
--print-details='off' \
--write-failing-tests-without-issue-trackers-to-file='promotedAtdmTrilinosTwoif.csv' \
--write-test-data-to-file='promotedAtdmTrilinosTestData.py' \
--write-email-to-file='promotedAtdmTrilinosBuilds.html' \
--email-from-address='' \
--send-email-to='' \
Do you know why this might be happening? I tried running the scripts without Grover, and I get the same error. It generates the HTML file (promotedAtdmTrilinosBuilds.html) and the JSON files (promotedAtdmTrilinosBuilds_fullCDashIndexBuilds.json and promotedAtdmTrilinosBuilds_fullCDashNonpassingTests.json) but stops short of generating the expected Python file (promotedAtdmTrilinosTestData.py).
@rmmilewi, pull updated versions of TriBITS and TrilinosATDMStatus repos. The next upgrade of CDash being evaluated on testing-dev.sandia.gov/cdash/ had a non-backward compatible change to the json data. See https://github.com/TriBITSPub/TriBITS/commit/5b3345e213e0e1ca8f023dd801beda8afef0b1c6 for the fix (which works with old and new CDash).
Yeah, that fixed it. Thanks!
Okay, so, below you'll find an example of what Grover currently produces, in this case for issue #7089, which has 14 test result entries associated (listed either under the promoted ATDM builds or the specialized ones). To publish these comments, Grover will need an account of his own that is part of the Trilinos organization and a personal access token that gives permission to post to issues; I've been testing the capability under my own name and with my own token on a dummy repository.
This was computed about 10 minutes ago, so let me know if what you're seeing matches what you would expect to see (no missing tests, all the columns you wanted, etc.). A few observations...
Hello! This is an automated comment generated by Grover. Each week, I collate and report data from CDash in an automated way to make it easier for developers to stay on top of their issues. I saw that there are tests being tracked on CDash that are associated with this open issue, and I have compiled the status information on each for you.
If you would like me to stop posting comments to this thread, comment "grover shoo" (case-insensitive). If you have a question, please reach out to Ross. I'm just a cat.
@rmmilewi, some feedback ...
A) I think we want a separate table for each category of tests and a header at the top like:
Like you see in the example above. I think some simple refactoring and movement of code in the `CDashQueryAnalyzeReport.py` and `cdash_analyze_and_report.py` modules will make this easy.
Where is the Grover source code so that I can look at this?
B) I think we want to move the Grover intro paragraph to the bottom of the comment (and might even compress it in a block like:
C) I think we should not list out how to make the comments stop coming. That is, we should remove the paragraph:
> If you would like me to stop posting comments to this thread, comment "grover shoo" (case-insensitive). If you have a question, please reach out to Ross. I'm just a cat.
The way that developers make these comments go away is to fix their tests (or disable them). Otherwise, we may need to disable the testing of their package in ATDM Trilinos testing if they don't want to see these reminders.
@bartlettroscoe
> I think we want a separate table for each category of tests and a header at the top like [...]
That's easy enough to do on Grover's end. I can tally up that information easily, unless you want to do that via CDashQueryAnalyzeReport and I simply call a function of yours. Either way works for me.
> compress it in a block
I had no idea we could hide comment text like that, that's perfect. I like this solution.
> I think we should not list out how to make the comments stop coming. [...] The way that developers make these comments go away is to fix their tests (or disable them). Otherwise, we may need to disable the testing of their package in ATDM Trilinos testing if they don't want to see these reminders.
That's a very good point. We shouldn't make it easy to simply ignore these test results. I'll remove that line.
@bartlettroscoe As for those tallies, I should point out that all the CDash entries get converted to an internal format Grover uses which can seamlessly be converted back to dictionaries of the form that CDashQueryAnalyzeReport expects. This makes it easier for me to encapsulate data sanitization steps and it ensures that ill-formed/incomplete/unrecognized data can never cause Grover to fail outright.
One consequence of this is that it allows me to have typed interactions with the CDash data; it's straightforward for me to perform queries over all the CDash result objects. If you want additional information computed based on the results, I can handle that on my end. That is, if you'd like.
> I can tally up that information easily, unless you want to do that via CDashQueryAnalyzeReport and I simply call a function of yours. Either way works for me.
@rmmilewi, let me take a look at the code in `CDashQueryAnalyzeReport.py` and `cdash_analyze_and_report.py` and I will get back to you. I suspect we can factor out some simple functions that Grover can just call, and we can exclude any troublesome data like the `<style>...</style>` block that GitHub Markdown is not handling well.
The more reusable code that lives in TriBITS that we can leverage, the better.
> One consequence of this is that it allows me to have typed interactions with the CDash data; it's straightforward for me to perform queries over all the CDash result objects.
Just so we are clear, I don't think Grover should directly query CDash. It should just operate on the data provided as output from (indirect) `cdash_analyze_and_report.py` calls.
> The more reusable code that lives in TriBITS that we can leverage, the better.
That's a good point. There are definitely opportunities to refactor that code in TriBITS to make it more reusable.
> Just so we are clear, I don't think Grover should directly query CDash. It should just operate on the data provided as output from (indirect) cdash_analyze_and_report.py calls.
Oh, of course, I agree. Best to keep those separate. I have no plans to interact with CDash in any way, just indirectly through cdash_analyze_and_report.
I just meant that when you pass me data, it's captured in a way that I can write checks/queries against the data that are guaranteed to succeed. Like the status flag is encoded as an enum, and I ensure that even if the status is missing, is garbage, is all in upper case, is of a novel type, etc., there will always be an intelligible unit of data for me to act upon (even if it's just `StatusFlag.UNKNOWN`). Or like how the issue number needs to be a string in some contexts (`"#5732"`) and an integer (`5732`) when passed to PyGithub; I don't have to think about that in the code.
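An illustrative sketch of that normalization idea (these class and function names are hypothetical, not Grover's actual code; `enum` is stdlib in Python 3 and available for 2.7 via the enum34 backport):

```python
from enum import Enum

class StatusFlag(Enum):
    PASSED = "passed"
    FAILED = "failed"
    NOT_RUN = "not run"
    MISSING = "missing"
    UNKNOWN = "unknown"

def parseStatusFlag(rawStatus):
    # Tolerate missing/garbage/odd-case input; never fail outright.
    try:
        return StatusFlag(str(rawStatus).strip().lower())
    except ValueError:
        return StatusFlag.UNKNOWN

def issueNumberAsInt(issueStr):
    # "#5732" as displayed on GitHub -> 5732 as PyGithub expects.
    return int(str(issueStr).lstrip('#'))
```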
> I should point out that all the CDash entries get converted to an internal format Grover uses which can seamlessly be converted back to dictionaries of the form that CDashQueryAnalyzeReport expects.
@rmmilewi, any chance I can see the Grover code in its current state? That will provide a guide for any refactoring of `CDashQueryAnalyzeReport.py` and `cdash_analyze_and_report.py`.
@bartlettroscoe Oh, of course! I'll see about granting you access here in just a moment.
Okay, so I just granted you access to the repository on gitlab-ex: https://gitlab-ex.sandia.gov/rmilewi/grover
In grover/core/interactor.py, you'll find `TriBITSCommentFormatter` on line 287. That's where I interact with CDashAnalyzeAndReport to generate the table. The formatter is used by `UpdateIssueTrackerInteractor`, which you can find directly below on line 317.
Let me know if you have any questions!
> In grover/core/interactor.py, you'll find TriBITSCommentFormatter on line 287. That's where I interact with CDashAnalyzeAndReport to generate the table. The formatter is used by UpdateIssueTrackerInteractor, which you can find directly below on line 317.
@rmmilewi, okay, it will be straightforward to replace that code with calls to TriBITS code. I will provide a single function that will do everything.
And note that you will not be providing `daysOfHistory`. That is actually embedded in the data (and if it is not, I will add it, because that is not something you can change arbitrarily; it was determined at the time the data was pulled down off CDash).
@bartlettroscoe
> @rmmilewi, okay, it will be straightforward to replace that code with calls to TriBITS code. I will provide a single function that will do everything.
Sounds good to me!
> And note that you will not be providing daysOfHistory. That is actually embedded in the data (and if it is not, I will add it, because that is not something you can change arbitrarily; it was determined at the time the data was pulled down off CDash).
Right, I know. I just wasn't sure from where to pull that information in the event that it changes, because the default value is hidden away in the script (`--limit-test-history-days='30'`), and I didn't want to hardcode that information anywhere in Grover. It might be in the CDash data I get as input though, I should go back and check.
Just about ready to merge TriBITS code that generates the following type of HTML for a comment.
Tests with issue trackers Failed: twif=8
Tests with issue trackers Not Run: twinr=1
Site | Build Name | Test Name | Status | Details | Consecutive Non-pass Days | Non-pass Last 30 Days | Pass Last 30 Days | Issue Tracker |
---|---|---|---|---|---|---|---|---|
cee-rhel6 | Trilinos-atdm-cee-rhel6-clang-opt-serial | Teko_ModALPreconditioner_MPI_1 | Not Run | Required Files Missing | 15 | 15 | 0 | #3638 |
CC: @rmmilewi
I just merged the TriBITS PR https://github.com/TriBITSPub/TriBITS/pull/322 to the TriBITS 'master' branch that created a new class IssueTrackerTestsStatusReporter. I plugged this into the 'grover' code in the MR:
I manually posted what the content would look like as posted by 'grover' to a few Trilinos GitHub Issues:
There are a few issues that need to be fixed in 'grover' before we can deploy the first version. This is outlined in:
The only one that needs to get fixed before we can deploy an initial version of this tool is to refactor the code to not copy the test dicts into a different data-structure and then copy them back. The data was being copied back incorrectly in some cases, and it does not support the addition of new fields. Once that is fixed, I think we can deploy the first version that just updates the open issues once a week.
CC: @rmmilewi
A problem with this approach of just embedding the tables in the GitHub Issue comments as you can see in:
is that the test summary tables are hard to read. First, GitHub Issue comments are very narrow, so you can't see all of the columns of the table, and it makes for very tall rows. Second, the colors are not shown, which makes the tables harder to read.
Therefore, it would be good to find an easy way for developers to view these tables in their full HTML splendor. One approach might be to attach the HTML as a file to the GitHub Issue comment. I showed how this might look in:
However, you can't directly attach an `*.html` file and instead have to attach an `*.html.txt` file. Then the developer needs to download that file, change the extension from `*.html.txt` to `*.html`, and then open that file in a browser. So this is not very nice, and I don't think many developers will bother to do that.
Another approach would be to post the HTML to a website and then view it from there. We could do that with a GitHub Pages site maintained with a GitHub repo, for example. Or, we could just post it to the trilinos.org site. Just to show what that might look like, I manually posted:
and put in a link to this from:
That looks much better and is much more readable. I think we need to do something like this to make this information more readable and useful for developers.
Below I log some details of how I tested this with 'grover' and the subtasks that I did in the last week.
FYI: Met with @rmmilewi yesterday to go over the current status of Grover towards the minimum viable product. Here are the todos:
- Exclude the `<html>` block.
- Add an option `--dummyIssueId=<issueid>` to submit the status of all of the tracked tests as new comments to the dummy Trilinos GitHub issue. (That will both validate that the GitHub authentication is working correctly and it will show, again, what the comments look like.)

@bartlettroscoe @rmmilewi Thanks for setting this up -- it looks to be really useful. Just wanted to mention a small typo: the Grover message is missing the word "are" just after "there":
> Grover saw that there tests being tracked on CDash that are associated with this open issue.
> Just wanted to mention a small typo: the Grover message is missing the word "are" just after "there"
@jhux2, thanks for the catch! Should be fixed going forward.
We have now finally deployed the first "Minimum Viable Product" that adds a single comment per open ATDM Trilinos GitHub issue as described in ATDV-365. The Jenkins project:
is set up to run once a week at 7 AM on Mondays to post these comments. I sent out the following email to the Trilinos developers yesterday, and then I manually ran the above Jenkins project, which resulted in 11 comments being posted; the log output showed:
16:47:35 [main.py] issue #6126, success=False, message:The issue (#6126) is not open. Right now, Grover avoids publishing results to closed issues.
16:47:35 [main.py] issue #6246, success=False, message:The issue (#6246) is not open. Right now, Grover avoids publishing results to closed issues.
16:47:35 [main.py] issue #3862, success=False, message:The issue (#3862) is not open. Right now, Grover avoids publishing results to closed issues.
16:47:35 [main.py] issue #6882, success=False, message:The issue (#6882) is not open. Right now, Grover avoids publishing results to closed issues.
16:47:35 [main.py] issue #6540, success=True, message:None
16:47:35 [main.py] issue #6799, success=True, message:None
16:47:35 [main.py] issue #7778, success=True, message:None
16:47:35 [main.py] issue #3863, success=True, message:None
16:47:35 [main.py] issue #6790, success=True, message:None
16:47:35 [main.py] issue #6991, success=False, message:The issue (#6991) is not open. Right now, Grover avoids publishing results to closed issues.
16:47:35 [main.py] issue #7690, success=False, message:The issue (#7690) is not open. Right now, Grover avoids publishing results to closed issues.
16:47:35 [main.py] issue #6009, success=True, message:None
16:47:35 [main.py] issue #6216, success=True, message:None
16:47:35 [main.py] issue #6553, success=True, message:None
16:47:35 [main.py] issue #6455, success=True, message:None
16:47:35 [main.py] issue #5006, success=True, message:None
16:47:35 [main.py] issue #7089, success=True, message:None
and posted the comments:
The next step is to update the comments so they give suggestions on whether the issue can be closed because the issue has been addressed.
Related to:
This issue has had no activity for 365 days and is marked for closure. It will be closed after an additional 30 days of inactivity.
If you would like to keep this issue open, please add a comment and/or remove the `MARKED_FOR_CLOSURE` label.
If this issue should be kept open even with no activity beyond the time limits, you can add the label `DO_NOT_AUTOCLOSE`.
If it is ok for this issue to be closed, feel free to go ahead and close it. Please do not add any comments or change any labels or otherwise touch this issue unless your intention is to reset the inactivity counter for an additional year.
CC: @fryeguy52, @trilinos/framework
Description
As part of the work to implement a Python tool to pull down, analyze, and summarize CDash build and test data in #2933, we realized that we had all of the information that would be needed to update Trilinos GitHub issues about the status of tests associated with those issues. The idea would be to add a GitHub issue comment containing tables showing the current status of the tests related to a GitHub Issue. An example of what this could look like is shown in https://github.com/trilinos/Trilinos/issues/3579#issuecomment-438324283 and https://github.com/trilinos/Trilinos/issues/3833#issuecomment-438317812 where I just manually copied and pasted the HTML-formatted tables and rows for those issues right into the GitHub comments.
For #3579 on 11/14/2018, the results for 11/13/2018 might look like:
Test results for #3579 as of testing day 2018-11-13
Tests with issue trackers Passed: twip=2
Tests with issue trackers Missing: twim=3
NOTE: With all of the tests associated with this issue passing or missing (i.e. disabled), might this Issue be addressed and perhaps be closed?
Detailed test results: (click to expand)
Tests with issue trackers Passed: twip=2
Tests with issue trackers Missing: twim=3
So the idea is that once that comment was added, the Trilinos developer responsible for the GitHub issue could add a comment stating that this was not a randomly failing test, so having this test pass or be disabled indicated that the issue could be resolved, and then close the issue. No need to look at CDash directly, add in new CDash links, etc. Just comment and close. That would save a lot of human time.
So we would add these comments in the following cases:
- Once a week, just as a reminder of the current status of the tests related to this issue (so that Trilinos developers would not forget about the issue).
- When one of the associated tests changed status (e.g., went from passing to failing or failing to passing). (But not for frequent randomly failing tests, or that would create a lot of spam updates.)
- When all of the associated tests were passing or missing for X (e.g. 2) consecutive days, like shown above. (But not for any randomly failing tests, or that would create a lot of spam.)
- When all of the associated tests are passing for the full acquired test history (e.g. 30 days) or are missing for frequent randomly failing tests. (But not for rare randomly failing tests.)
This should make it so that people don't need to manually check on the status of the associated tests for a GitHub issue. They could just let an automated system (that will be created in this Story) update the GitHub issue when something worth noting has occurred and when the issue might be closed.
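As a hedged sketch of the trigger logic described by the cases above (all names and thresholds here are illustrative, not the tool's actual design):

```python
def shouldPostComment(statusHistory, daysSinceLastComment,
                      isRandomlyFailing, consecutiveDaysToSuggestClose=2):
    """statusHistory: most-recent-first list of daily test statuses,
    e.g. ['Passed', 'Passed', 'Failed', ...]."""
    # Case 1: weekly reminder, regardless of status.
    if daysSinceLastComment >= 7:
        return True
    # Avoid spam for randomly failing tests in the remaining cases.
    if isRandomlyFailing:
        return False
    # Case 2: a test changed status since yesterday.
    if len(statusHistory) >= 2 and statusHistory[0] != statusHistory[1]:
        return True
    # Case 3: all passing or missing for X consecutive days.
    recent = statusHistory[:consecutiveDaysToSuggestClose]
    if recent and all(s in ('Passed', 'Missing') for s in recent):
        return True
    return False
```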
Also, this tool could check that if all of the tests are passing and the GitHub issue was closed, then it could automatically remove the entries for that issue from a list of `*.csv` files. And, if it detects that tests are not all passing and the GitHub issue is closed, then it could automatically re-open the issue and provide a summary of the test results.
Related
Tasks
- Update the `cdash_analyze_and_report.py` tool to write detailed test information and history for all tests with issue trackers being monitored by the tool. [Done]
- Create a tool `update_github_issues_for_tests_status.py` that will update GitHub issues as described above given the output from the `cdash_analyze_and_report.py` tool described above ... See grover_update_trilinos_github_issues_with_test_status.sh [Done]
- Update `cdash_analyze_and_report.py` to provide the `fail_frequency` field and put it in the output data-structure for each issue.
- Update `CDashQueryAnalyzeReport.py` to provide pass/fail criteria for each issue and provide suggestions for when to close an issue based on if tests passing for X days matching `fail_frequency` criteria. (needs more analysis)
- Automatically remove entries from the `*.csv` file for Issues that are closed and have met the passing criteria based on `fail_frequency` logic (see above).