Open c-suh opened 10 months ago
Note: This was started some time ago, and communications are via email. Currently, there is an error when spinning up a site, and I have let the SA know.
The SA fixed the error mentioned above, which allowed more testing. That testing revealed more issues, which the SA is now working on.
Issues for the front-end will not be resolved. Colleen is testing by running the metrics.
Have left new request via email for the SA regarding NAIF logs.
Starting the process for January metrics to see how it runs. One initial finding is that the EN logs are missing; I need to ping the SAs about this.
Still doing that.
January metrics didn't run properly. Changed configuration for dates and restarted, and it is still in the preprocess/update phase.
Still running, but I think it's on the last database update (all_pds; a few days ago, it was on sbn_umd). I've located the configuration files and compared them (only one difference, which I've changed to match). Once January is finished, I will check in with Jordan about which factors to tweak (e.g., memory usage) and test with February.
The process was mistakenly interrupted. @c-suh will restart it with an improved configuration for performance.
Have discovered a number of problems with getting logs from various nodes, caused by a mixture of issues with this new machine and changes on the nodes' ends. Have asked who to contact and will continue researching the machine issues.
Have reached out to all nodes and addressed some of the issues. Am awaiting answers from the SAs regarding a few others and am researching the rest, at least one of which is probably due to the Java upgrade.
One more has been addressed; another to be tested.
Have set up Kai's script for syncing logs, although it is not yet automated or running under the service account (a ticket with the SAs has been created for this).
Some progress on that.
Remaining issues:
💡 Description
SAs have set up a replacement machine which needs to be verified before the current metrics machine is shut off.