NASA-PDS / operations

Tickets for the PDSEN Operations Team

Verify new machine for metrics #421

Open c-suh opened 10 months ago

c-suh commented 10 months ago

💡 Description

SAs have set up a replacement machine which needs to be verified before the current metrics machine is shut off.

c-suh commented 10 months ago

Note: This was started some time ago, and communications are via email. Currently, there is an error when spinning up a site, and I have let the SA know.

c-suh commented 10 months ago

The SA fixed the error mentioned above and was able to run more tests, which revealed further issues that the SA is now working on.

c-suh commented 10 months ago

Issues for the front-end will not be resolved. Colleen is testing by running the metrics.

c-suh commented 10 months ago

Have left a new request via email with the SA regarding NAIF logs.

c-suh commented 5 months ago

Starting the process for January metrics to see how it runs. One initial finding is that the EN logs are missing; I will ping the SAs about this.

tloubrieu-jpl commented 4 months ago

still doing that.

c-suh commented 4 months ago

January metrics didn't run properly. Changed configuration for dates and restarted, and it is still in the preprocess/update phase.

c-suh commented 3 months ago

Still running, but I think it's on the last database update (all_pds; a few days ago, it was on sbn_umd). I've located the configuration files and compared them (only one difference, which I've changed to match). Once January is finished, I will check in with Jordan about which factors to tweak (e.g., memory usage) and test with February.
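For reference, a comparison like the one described above can be sketched in a few lines of Python. This is only an illustration, not the method actually used here: the function name `config_diff`, the file names, and the configuration keys are all hypothetical.

```python
import difflib


def config_diff(old_text, new_text, old_name="old.conf", new_name="new.conf"):
    """Return the unified-diff lines between two configuration file contents."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Hypothetical contents standing in for the old and new machines' configs.
    old = "log_dir=/data/logs\nmax_memory=4g\n"
    new = "log_dir=/data/logs\nmax_memory=2g\n"
    for line in config_diff(old, new):
        print(line)
```

An empty result means the two files match; any line starting with `-` or `+` marks a setting that differs between the machines.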

tloubrieu-jpl commented 3 months ago

The process was mistakenly interrupted. @c-suh will restart it with an improved configuration for performance.

c-suh commented 3 months ago

Have discovered a number of problems with getting logs from various nodes, caused by a mixture of issues with this new machine and changes on the nodes' ends. Have asked whom to contact and will continue researching the machine issues.

c-suh commented 3 months ago

Have reached out to all nodes and addressed some of the issues. Am awaiting answers from the SAs regarding a few others and am researching the rest, at least one of which is probably due to the Java upgrade.

c-suh commented 3 months ago

One more has been addressed; another to be tested.

c-suh commented 3 months ago

Have set up Kai's script for syncing logs, although it is not yet automated and does not yet run under the service account (a ticket with the SAs has been created for this).

tloubrieu-jpl commented 2 months ago

Some progress on that.

c-suh commented 2 months ago

Remaining issues: