Closed samreid closed 3 years ago
I tried using shell scripting to accomplish this, but it appears nontrivial to capture output from background processes. I'd like to try Node.
I added a script (bottom result), it's about the same speed as pull-all.sh -p
(top result):
(3.364+3.417+3.504)/3
3.4283333333333332
(3.407+3.473+3.405)/3
3.4283333333333332
The output seems much nicer, but I'll test it over the next few business days to see if I want to make improvements. I'm also considering an option to pull-all without rebase or with stash push-pop. Or an interactive mode that iterates over repos that didn't pull and asks if you want to pull without rebase or try a push-pop.
I wrote to slack dev public:
Would a windows developer volunteer to be a guinea pig to try out some new “pull all” and “push all” scripts, next time you need to push or pull? If you are in the root of your checkouts, the commands are:
node perennial/js/scripts/pull-all.js node perennial/js/scripts/push-all.js
There are inherently parallel. Context is https://github.com/phetsims/perennial/issues/206
I work on a windows 10 machine and just tried both pull-all.js and push-all.js, it worked well! I did get an error about "credential storage lock" in one repo, but the commit was still pushed successfully. I just checked and I am seeing the same thing using pull-all.sh, so it probably isn't specific to these new scripts but maybe using git credentials in parallel.
$ node perennial/js/scripts/push-all.js
(node:29608) Warning: Accessing non-existent property 'padLevels' of module exports inside circular dependency
(Use `node --trace-warnings ...` to show where the warning was created)
chipper
remote:
remote: GitHub found 1 vulnerability on phetsims/chipper's default branch (1 moderate). To find out more, visit:
remote: https://github.com/phetsims/chipper/security/dependabot/package.json/jpeg-js/open
remote:
To https://github.com/phetsims/chipper.git
6141b516..c7c4cd8a master -> master
scenery
fatal: unable to get credential storage lock: File exists
To https://github.com/phetsims/scenery.git
4a37554a..a824e1b2 master -> master
sun
To https://github.com/phetsims/sun.git
3a54a2db..0bcd0609 master -> master
Already up to date.
phet-io-wrapper-haptics
Already up to date.
phet-io-wrapper-hookes-law-energy
Already up to date.
fatal: unable to get credential storage lock: File exists
phet-io-wrapper-lab-book
Already up to date.
phet-io-wrappers
Already up to date.
fatal: unable to get credential storage lock: File exists
phet-ios-app
Already up to date.
Similar to https://github.com/phetsims/perennial/issues/205#issuecomment-770065853, I was unable to use this js script:
Michael ~/PhET/git
$ node perennial/js/scripts/pull-all.js
(node:24128) Warning: Accessing non-existent property 'padLevels' of module exports inside circular dependency
(Use `node --trace-warnings ...` to show where the warning was created)
events.js:292
throw er; // Unhandled 'error' event
^
Error: spawn git ENOENT
at Process.ChildProcess._handle.onexit (internal/child_process.js:269:19)
at onErrorNT (internal/child_process.js:465:16)
at processTicksAndRejections (internal/process/task_queues.js:80:21)
Emitted 'error' event on ChildProcess instance at:
at Process.ChildProcess._handle.onexit (internal/child_process.js:275:12)
at onErrorNT (internal/child_process.js:465:16)
at processTicksAndRejections (internal/process/task_queues.js:80:21) {
errno: -4058,
code: 'ENOENT',
syscall: 'spawn git',
path: 'git',
spawnargs: [ 'pull', '--rebase' ]
}
Any ideas on why I have this but perhaps others don't
Michael ~/PhET/git
$ node -v
v14.15.0
Michael ~/PhET/git
$ npm -v
6.14.8
If we want to get it working on machines like @zepumph we could try _.chunk and running them as batches, but as long as he has a solution that works well for him, I'm not sure we need to expend the effort.
Also, thanks @jessegreenberg and @zepumph for testing!
This script works for me about 75% of the time for me right now, but sometimes it fails with batch timeout problems:
under-pressure
fatal: unable to access 'https://github.com/phetsims/under-pressure.git/': Failed to connect to github.com port 443: Timed out
unit-rates
fatal: unable to access 'https://github.com/phetsims/unit-rates.git/': Failed to connect to github.com port 443: Timed out
utterance-queue
fatal: unable to access 'https://github.com/phetsims/utterance-queue.git/': Failed to connect to github.com port 443: Timed out
vector-addition
fatal: unable to access 'https://github.com/phetsims/vector-addition.git/': Failed to connect to github.com port 443: Timed out
vector-addition-equations
fatal: unable to access 'https://github.com/phetsims/vector-addition-equations.git/': Failed to connect to github.com port 443: Timed out
vegas
fatal: unable to access 'https://github.com/phetsims/vegas.git/': Failed to connect to github.com port 443: Timed out
vibe
fatal: unable to access 'https://github.com/phetsims/vibe.git/': Failed to connect to github.com port 443: Timed out
wave-interference
fatal: unable to access 'https://github.com/phetsims/wave-interference.git/': Failed to connect to github.com port 443: Timed out
wave-on-a-string
fatal: unable to access 'https://github.com/phetsims/wave-on-a-string.git/': Failed to connect to github.com port 443: Timed out
waves-intro
fatal: unable to access 'https://github.com/phetsims/waves-intro.git/': Failed to connect to github.com port 443: Timed out
weddell
fatal: unable to access 'https://github.com/phetsims/weddell.git/': Failed to connect to github.com port 443: Timed out
wilder
fatal: unable to access 'https://github.com/phetsims/wilder.git/': Failed to connect to github.com port 443: Timed out
xray-diffraction
fatal: unable to access 'https://github.com/phetsims/xray-diffraction.git/': Failed to connect to github.com port 443: Timed out
yotta
fatal: unable to access 'https://github.com/phetsims/yotta.git/': Failed to connect to github.com port 443: Timed out
I would recommend adding in the _.chunk
functionality as an option, that defaults to not chunking (running them all at once). Then it could work well for me (on windows) to know to specify a chunk even if you don't have to.
--rebase
an option that is auto on. Sometimes it is nice to merge in some cases.I added the _.chunk strategy in this patch:
Here are the results from the timing on my system:
~/apache-document-root/main$ time pull 1
real 1m11.001s
~/apache-document-root/main$ time pull 4
real 0m20.314s
~/apache-document-root/main$ time pull 8
real 0m11.614s
~/apache-document-root/main$ time pull 16
real 0m6.860s
~/apache-document-root/main$ time pull 32
real 0m4.864s
~/apache-document-root/main$ time pull 64
real 0m3.912s
~/apache-document-root/main$ time pull 128
real 0m3.511s
~/apache-document-root/main$ time pull 256
real 0m3.345s
I should note that there were 2 failures where a git pull just stalled out--I haven't seen that symptom before, not sure if it was related to the patch or if there was a temporary web/github outage.
@zepumph would you be interested in checking the patch, trying increasing numbers of chunkSize values to see where it starts failing? Maybe we can pick a chunk value that is fast enough and not too crashy. Or if we need to use different chunkSizes on different platforms, we can leave that as an option.
No crashing this time! I think we have a winner. What do you think?
$ time node perennial/js/scripts/pull-all.js 1
real 4m26.220s
$ time node perennial/js/scripts/pull-all.js 2
real 7m10.628s
$ time node perennial/js/scripts/pull-all.js 8
real 1m17.600s
$ time node perennial/js/scripts/pull-all.js 32
real 0m19.917s
$ time node perennial/js/scripts/pull-all.js 64
real 0m21.323s
$ time node perennial/js/scripts/pull-all.js 128
real 0m12.711s
$ time node perennial/js/scripts/pull-all.js 256
real 0m11.158s
That said, @samreid recommended trying again without the patch, and there were no problems time was:
10.8 14.5 12.2 17.9
So I guess I have no opinion. Seems like the next step for me is to just use this instead of the bash script, and I can report errors if they are systemic.
I've been using pull-all.js for the past week with great enjoyment and success. Today I can't run it without getting this error:
Michael ~/PHET/git/chipper (master)
$ pa
(node:17704) Warning: Accessing non-existent property 'padLevels' of module exports inside circular dependency
(Use `node --trace-warnings ...` to show where the warning was created)
events.js:292
throw er; // Unhandled 'error' event
^
Error: spawn git ENOENT
at Process.ChildProcess._handle.onexit (internal/child_process.js:269:19)
at onErrorNT (internal/child_process.js:465:16)
at processTicksAndRejections (internal/process/task_queues.js:80:21)
Emitted 'error' event on ChildProcess instance at:
at Process.ChildProcess._handle.onexit (internal/child_process.js:275:12)
at onErrorNT (internal/child_process.js:465:16)
at processTicksAndRejections (internal/process/task_queues.js:80:21) {
errno: -4058,
code: 'ENOENT',
syscall: 'spawn git',
path: 'git',
spawnargs: [ 'pull', '--rebase' ]
}
This was because I hadn't pulled installer-builder since it was archived/deleted. It would be nice to get better error handling. I'll look into it.
This patch demonstrates the problem, and works towards catching the error. Perhaps we can work to improve the error message in pull-all.js. I'm also happy to ignore. I just hate how general the ENOENT error is, and would prefer to help future users of this script. @samreid do you have any ideas on how to improve?
Index: js/scripts/pull-all.js
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/js/scripts/pull-all.js b/js/scripts/pull-all.js
--- a/js/scripts/pull-all.js (revision c7afc6bbfb84ce4b1a5fbc6580ed8fd52a0e619f)
+++ b/js/scripts/pull-all.js (date 1614107839329)
@@ -7,7 +7,7 @@
// constants
// Don't use getActiveRepos() since it cannot be run from the root
const contents = fs.readFileSync( 'perennial/data/active-repos', 'utf8' ).trim();
-const repos = contents.split( '\n' ).map( sim => sim.trim() );
+const repos = [ 'jfkdslafds' ];
/**
* Pulls all repos (with rebase)
@@ -20,12 +20,20 @@
*/
( async () => {
- const a = repos.map( repo => execute( 'git', [ 'pull', '--rebase' ], `${repo}`, {
+
+ let out;
+ try {
+ const a = repos.map( repo => execute( 'git', [ 'pull', '--rebase' ], `${repo}`, {
- // resolve errors so Promise.all doesn't fail on first repo that cannot pull/rebase
- errors: 'resolve'
- } ) );
- const out = await Promise.all( a );
+ // resolve errors so Promise.all doesn't fail on first repo that cannot pull/rebase
+ errors: 'resolve'
+ } ) );
+
+ out = await Promise.all( a );
+ }
+ catch( e ) {
+ return console.log( e );
+ }
// Report results
for ( let i = 0; i < a.length; i++ ) {
Index: js/common/execute.js
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/js/common/execute.js b/js/common/execute.js
--- a/js/common/execute.js (revision c7afc6bbfb84ce4b1a5fbc6580ed8fd52a0e619f)
+++ b/js/common/execute.js (date 1614107746329)
@@ -76,21 +76,29 @@
stdout += data;
} );
+ let rejectedByError = false;
+ process.on( 'error', error => {
+ rejectedByError = true;
+ reject( new Error( `Error in executing spawned Process: ${error}` ) );
+ } );
+
process.on( 'close', code => {
winston.debug( `Command ${cmd} finished. Output is below.` );
winston.debug( stderr && `stderr: ${stderr}` || 'stderr is empty.' );
winston.debug( stdout && `stdout: ${stdout}` || 'stdout is empty.' );
- if ( options.errors === 'resolve' ) {
- resolve( { code: code, stdout: stdout, stderr: stderr } );
- }
- else {
- if ( code !== 0 ) {
- reject( new ExecuteError( cmd, args, cwd, stdout, stderr, code ) );
- }
- else {
- resolve( stdout );
+ if ( !rejectedByError ) {
+ if ( options.errors === 'resolve' ) {
+ resolve( { code: code, stdout: stdout, stderr: stderr } );
+ }
+ else {
+ if ( code !== 0 ) {
+ reject( new ExecuteError( cmd, args, cwd, stdout, stderr, code ) );
+ }
+ else {
+ resolve( stdout );
+ }
}
}
} );
That proposal looks good to me, but I recommend the following:
reject
calls be more aligned? One is:
reject( new ExecuteError( cmd, args, cwd, stdout, stderr, code ) );
And one is
reject( new Error( `Error in executing spawned Process: ${error}` ) );
Those changes look like it wouldn't give a consistent rejection type. I'm curious what the purpose of the patch is, or why it would be needed?
I found that there was a class of error that wasn't being handled by execute
, in which the spawning of the child process itself threw an error, rather than the process that was spawned (which ExecuteError
handles nicely).
In my particular example in pull-all.sh, active-repos had a repo that had just been deleted (installer-builder), and pull-all tried to execute git pull
inside installer-builder as the cwd. This fails to spawn correctly, but the error was really hard to understand, and so the above patch tries to support the case where we can get a helpful error out when we try to execute something bogus that won't even spawn.
Ok, sounds good to handle. I'd like to ensure that we have consistent reject types (because code COULD be wanting to parse things out). Sounds like there won't be an exit code to debug? Is 'close' guaranteed to NOT be invoked? Or could we JUST improve the error message and then use the same error handling code?
Here's an updated patch (other one has conflicts)
Also, I cannot find process.on('error')
documented anywhere in https://nodejs.org/api/process.html, so I have some questions about that.
I recommend the following changes for execute.js.
message
new ExecuteResult
.message
I am unfamiliar with the usages of execute
that expect a string to be returned, or expect an Error
in rejection. @jonathanolson does this seem like a feasible change?
I was just bit again by the lack of error handling via process.on( 'error' . . .
because my second copy of git didn't have all the repos in active-repos. I'm going to commit a callback for process.on
"error" now.
I again can't figure out a way to get a useful error from child process. I'm going to give up for now.
There is some motion on this problem in https://github.com/phetsims/perennial/issues/220#issuecomment-877464043. https://github.com/phetsims/perennial/issues/206#issuecomment-793031078 still seems like a good idea though.
I moved https://github.com/phetsims/perennial/issues/206#issuecomment-793031078 to new issue https://github.com/phetsims/perennial/issues/231 and there is a proposed fix for the ENOENT in https://github.com/phetsims/perennial/issues/220#issuecomment-877464043. Let's close this issue and continue in the related issues.
One of my most-used scripts is
pull-all.sh -p
, which works great, but it does not output the results in a friendly way. For instance, here is a result from a recent run:Generally, it is impossible to tell which sims were updated, which have local changes and which were up-to-date. I'd like to work on making this more legible.