Closed zepumph closed 1 year ago
So far I have been logging my pre-commit-hooks output by saving the output in my bash alias for it.
alias pch="( cd ~/Documents/Development/phetsims && (node.exe chipper/js/scripts/precommit-hook-multi.js)) 2>&1 | tee -a ~/pch-times.txt"
Let's also add output from other parts of the tooling, so we can track how long builds are taking, etc.
If we go with plaintext, it could look like:
# precommit hooks 10-13-2022 9:43am
- lint 23s CACHE MISS
- type check 1s CACHE HIT
- phet-io API check 15s CACHE MISS
# build gravity-and-orbits --brands=phet,phet-io 10-13-2022 9:43am
- lint 1s CACHE HIT
- type check 12s CACHE MISS
- phet brand output 22s
- phet-io brand output 24s
If we want JSON, it could look like:
[{
"taskname": "precommit-hooks",
"startTime": "Oct 13, 2022 9am",
"subtasks": {
"lint": { "time": 23, "cacheHit": false },
"type check": { "time": 23, "cacheHit": false }
}
}]
Or we would probably want subtasks in an array
[{
"taskname": "precommit-hooks",
"startTime": "Oct 13, 2022 9am",
"subtasks": [
{"taskname":"lint", "time": 23, "cacheHit": false },
{"taskname":"type check", "time": 23, "cacheHit": false }
]
},
{
"taskname": "build gravity-and-orbits --brands=phet,phet-io",
"startTime": "Oct 13, 2022 9am",
"subtasks": [
{"taskname":"lint", "time": 23, "cacheHit": false },
{"taskname":"type check", "time": 23, "cacheHit": false },
{"taskname":"phet brand output", "time": 23, "cacheHit": false },
{"taskname":"phet-io brand output", "time": 23, "cacheHit": false },
]
}
]
Plain text sounds easier to stream to it and more human-readable, but JSON sounds more machine readable-friendly. Not clear what's more important here, but I'm thinking we should probably start with JSON for ease of parsing and generating.
We should also consider how we want to look at these logs. Is it more like:
or
Summarizing questions: How do we use/aggregate/report the data? Should this be "always on", or "opt in"? Where should the log output? Syntax/format for output? Should we support infinitely nested sub-tasks?
I'm planning to work on this issue today
@jessegreenberg said on slack:
That is kind of what I was thinking [focus on human readability]. From a file like that it would be easy enough to sum/average times. I am fine with always on. Maybe save to chipper/data? Infinitely nested sub-tasks sounds more complicated but might help optimize if we decide that is necessary.
Some more considerations for this log:
startTask('name')
endTask()
or functional wrapTask( ()=>{})
? The latter seems more robust and less error-prone but also more verbose.hook-pre-commit: 6523
lint: 644 CACHE MISS
report-media: 12
tsc: 5850
tsc fake subtask: 1002
qunit: 4 CACHE HIT
phet-io-api-compare: 13
Current patch:
Here's an improved patch:
However, this strategy is incompatible with early exit (like process.exit) or even throwing errors, so we may need a different strategy. Or maybe try/catch the task?
Just jotting down an output style that can be streamed out, so would support early exiting and could also be used to see "what's happening currently" in a long process.
<hook-pre-commit>
<lint></lint>: 644
<report-media></report-media>: 12
<tsc>
<tsc fake subtask></tsc fake subtask>: 1002
</tsc>: 123
<qunit></qunit>: 4
<phet-io-api-compare></phet-io-api-compare>: 13
</hook-pre-commit>: 6523
Or
<hook-pre-commit>
<lint/>: 644
<report-media/>: 12
<tsc>
<tsc fake subtask/>: 1002
</tsc>: 123
<qunit/>: 4
<phet-io-api-compare/>: 13
</hook-pre-commit>: 6523
Here's an XML version:
<hook-pre-commit>
<lint/> <!-- 644ms -->
<report-media/> <!-- 12ms -->
<tsc>
<tsc-fake-subtask/> <!-- 1002ms -->
</tsc> <!-- 123ms -->
<qunit/> <!-- 4ms -->
<phet-io-api-compare/> <!-- 13ms -->
</hook-pre-commit> <!-- 6523ms -->
Current status:
UPDATE: even better patch.
Remaining work and questions for the reviewer:
grunt default
and build
steps need to flush the write stream?grunt default
?<!-- 10/16/2022, 10:45:37 AM -->
<hook-pre-commit repo="gravity-and-orbits">
<lint>
</lint> <!-- 554ms -->
<report-media>
</report-media> <!-- 12ms -->
<tsc>
</tsc> <!-- 4807ms -->
<qunit>
</qunit> <!-- 7962ms -->
<phet-io-api-compare>
</phet-io-api-compare> <!-- 5414ms -->
</hook-pre-commit repo="gravity-and-orbits"> <!-- 18749ms -->
But we could shorten it to:
<!-- 10/16/2022, 10:45:37 AM -->
<hook-pre-commit repo="gravity-and-orbits">
<lint/><!-- 554ms -->
<report-media/> <!-- 12ms -->
<tsc/> <!-- 4807ms -->
<qunit/> <!-- 7962ms -->
<phet-io-api-compare/> <!-- 5414ms -->
</hook-pre-commit repo="gravity-and-orbits"> <!-- 18749ms -->
repo=...
from the closing tag?~This is at a good checkpoint for Dev Meeting PSA.
This doesn't support multiple thread access very well, CT snapshots look like this:
<transpile>
</transpile> <!-- 9148ms -->
<build-brand-phet>
<webpack>
<transpile>
</transpile> <!-- 5440ms -->
<build-brand-phet>
<webpack>
</webpack> <!-- 9889ms -->
</webpack> <!-- 10292ms -->
</webpack> <!-- 9059ms -->
<minify-production>
<minify-production>
Looks like log rotation is not critical on CT for this file. Each snapshot log file was not too big, and they disappear with each snapshot.
By the way, there are some very disturbing numbers in the bayes CT phet-timing-log.
</build-brand-phet-io> <!-- 248430ms -->
That's over 4 minutes to build one brand of one sim.
I shared a PSA today. @kathy-phet asked @AgustinVallejo to review. Please see the checkboxes above for questions.
@samreid showed a demo of logs in perennial-alias/logs/phet-timing-log.txt. You can see how long processes take. It reports timing for several processes like tsc, lint, and grunt tasks.
There is documentation in the phet-timing-log about this and how to print output in real time.
@zepumph: How should we use this tool? I often have commit hooks that take longer than 100 seconds. What is an acceptable amount of time? @zepumph - sometimes to get around commit hook timing I commit locally without hooks and then run tests before pushing. @marlitas - I thought we shouldn't be making breaking changes even locally? @jonathanolson - Yes, it is better for checking out SHAs at a certain date to not have breaks committed to master. @jonathanolson - Have the commit hooks been successfully saving us from pushing breaking changes? @pixelzoom @jbphet @jessegreenberg - Yes! @marlitas - I think it would be nice to be able to make rapid commits locally or to a branch without waiting for commit hooks. @jbphet - You could make commits to a feature branch. Then when merge to master when the changes are complete. @samreid - Lets form a sub team for this. We have log data to look through now, and need a sub team to make decisions about this. @zepumph - I waited for built tools for 18.5 minutes yesterday. The time could be longer. Sometimes run without commit hooks. @jessegreenberg - I am going to try feature branches without commit hooks when working in a single repo for a bit! @kathy-phet - After the POSE convening we can have a sub-group discuss this.
I opened https://github.com/phetsims/chipper/issues/1350 about improving the precommit hook developer experience. This issue can remain specific to the timing logging.
Today @marlitas and I observed this output in the timing log:
~/apache-document-root/main$ tail -f ./perennial-alias/logs/phet-timing-log.txt
<hook-pre-commit repo="axon">
<lint>
<report-media>
<tsc>
<qunit>
<phet-io-api-compare>
</report-media> <!-- 533ms -->
</tsc> <!-- 620ms -->
</qunit> <!-- 8863ms -->
</hook-pre-commit repo="axon"> <!-- 23194ms -->
Note that lint never reported an end time. We think lint completed, but it did not flush the write buffer before exiting. We should check on that.
I'm not sure what caused that, since there is a flush command at the end of the hook-pre-commit:
// generatePhetioMacroAPI is preventing exit for unknown reasons, so manually exit here
phetTimingLog.close( () => process.exit( precommitSuccess ? 0 : 1 ) );
Maybe the reviewer can investigate and suggest possibilities.
Now that we have subbprocesses for our pre commit tasks, we need to add 2 things:
I updated the timing script to print much nicer, machine parsable text. It also doesn't have the start/end pieces anymore. We lose the ability to look into the logs at what is "currently" going, but we already kinda lost that when we started running the pre-commit-hook subtasks in parallel.
Now things look like:
<hook-pre-commit-perennial-alias date="1/10/2023, 11:05:17 AM">
<lint time="1042ms"/>
<qunit time="1036ms"/>
<phet-io-api-compare time="1057ms"/>
<report-media time="1094ms"/>
<tsc time="1369ms"/>
</hook-pre-commit-perennial-alias time="1389ms">
<hook-pre-commit repo="perennial-alias" date="1/10/2023, 11:05:50 AM">
<qunit time="1084ms"/>
<report-media time="1128ms"/>
<lint time="3231ms"/>
<tsc time="22616ms"/>
<phet-io-api-compare time="26748ms"/>
</hook-pre-commit repo="perennial-alias" time="26789ms">
<hook-pre-commit repo="perennial-alias" date="1/10/2023, 11:06:27 AM">
<qunit time="1075ms"/>
<report-media time="1150ms"/>
<lint time="3389ms"/>
<tsc time="22195ms"/>
<phet-io-api-compare time="25938ms"/>
</hook-pre-commit repo="perennial-alias" time="25984ms">
<hook-pre-commit repo="ratio-and-proportion" date="1/10/2023, 11:06:53 AM">
<phet-io-api-compare time="1123ms"/>
<report-media time="1226ms"/>
<qunit time="9386ms"/>
<tsc time="17299ms"/>
<lint time="20553ms"/>
</hook-pre-commit repo="ratio-and-proportion" time="20553ms">
Each task needs its "buffer flushed" before exiting from the last process run in hook-pre-commit.
I'm not sure what this means or what the bug is. Perhaps forget about it unless someone else can remember.
@samreid anything else here?
That looks better to me and more useful to me, but I'm somewhat concerned about the overlap between the buffer flushed issue and not reporting when a task starts. The symptom of the "buffer not flushed" issue is that the task exits before it has a chance to print to the console. So if there is a latent or spurious bug there, it would be difficult to detect.
In checking my phet-timing-log file, I see 1002 occurrences of <hook-pre-commit repo
and only 969 occurrences of <hook-pre-commit repo
. It could be that some of these are a failed process which never completed. But I wonder if some of these completed but due to a timing issue the process closed before the task could write the closing tag. Maybe it's best to manually inspect the code and make sure we are flushing the buffer before processes exit, and hope that is good enough? (And keep the improved formatting?)
I checked recent timing logs and everything seems OK. No further work for this issue. Thanks @zepumph! Closing.
From https://github.com/phetsims/chipper/issues/1325 today, we should investigate logging output from pre-commit hooks to help us analyze timing and problems, and to make improvements. Assigning @samreid and ME.