Open kltm opened 6 months ago
On a console review, I'm noticing a lot of late errors like:
03:27:00 + rsync -avz -e ssh -o StrictHostKeyChecking=no -o IdentitiesOnly=true -o IdentityFile=**** /opt/go-site/pipeline/target/blazegraph-production.jnl.gz skyhook@skyhook.berkeleybop.org:/home/skyhook/snapshot/products/blazegraph/
03:27:00 sending incremental file list
03:27:15 blazegraph-production.jnl.gz
03:27:15 deflate on token returned 0 (21379 bytes left)
03:27:15 rsync error: error in rsync protocol data stream (code 12) at token.c(481) [sender=3.2.7]
I'm going to try scp instead of rsync here for a bit.
(Noting that the internet says things like "need rsync on target", "need full path on target", and "need full path on ssh bin"; none really explain why it is intermittent.)
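One thing that might be worth trying alongside scp: the failure is reported from rsync's compression path ("deflate on token", token.c), and the file is already gzipped, so `-z` probably buys very little here. A rough sketch of retrying rsync without `-z` and then falling back to scp is below. The `retry` and `upload` helpers, the `RETRY_DELAY` and `SKYHOOK_IDENTITY` variables, and the retry count are all hypothetical names for illustration, not what the pipeline currently does, and there is no guarantee this addresses the actual cause of the intermittent error.

```shell
#!/usr/bin/env bash

# Hypothetical helper: run a command up to N times, pausing between tries.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  local attempts=$1
  shift
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "${RETRY_DELAY:-5}"
  done
}

# Source and destination copied from the console log above.
SRC=/opt/go-site/pipeline/target/blazegraph-production.jnl.gz
DEST=skyhook@skyhook.berkeleybop.org:/home/skyhook/snapshot/products/blazegraph/

# Hypothetical wrapper: rsync without -z (the payload is already gzipped),
# retried a few times, then scp as a last resort.
# SKYHOOK_IDENTITY is an assumed variable holding the masked key path.
upload() {
  local ssh_opts="-o StrictHostKeyChecking=no -o IdentitiesOnly=true -o IdentityFile=${SKYHOOK_IDENTITY}"
  retry 3 rsync -av -e "ssh ${ssh_opts}" "$SRC" "$DEST" ||
    scp -o StrictHostKeyChecking=no -o IdentitiesOnly=true \
        -o IdentityFile="${SKYHOOK_IDENTITY}" "$SRC" "$DEST"
}
```

In a pipeline step this would be invoked as `upload` after setting `SKYHOOK_IDENTITY`; a non-zero exit still fails the stage if both rsync and scp fail.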
Shockingly got a pass here; not sure if it was the changes or just luck. Will try again with the same set on release.
The stop issue seems to be continuing. Reduced executors to 5.
We are currently having issues with regularly and quickly getting full-data runs out of the pipeline. This is seriously affecting snapshot and release. As a bandaid to more long-term solutions (like pipeline refactoring and hardware purchasing), we're going to briefly experiment with limiting pipeline bandwidth (number of "workers") and increasing runtime resources for various parts.
This is a partial response to https://github.com/geneontology/pipeline/issues/316
Sending notice to @mugitty @sierra-moxon @dustine32
Tagging @pgaudet