eu-nebulous / sal-scripts

Mozilla Public License 2.0

Failed to install EMS on test environment #13

Closed robert-sanfeliu closed 1 month ago

robert-sanfeliu commented 2 months ago

On the test environment, the installation of EMS fails:

[2256t3@ip-172-31-38-34.ec2.internal;m10533-1-master-10533-1_Task_start_0;09:03:02] Error: INSTALLATION FAILED: failed to fetch https://jmarchel7bulls.github.io/helm-charts/ems-server-0.1.0.tgz : 404 Not Found

robert-sanfeliu commented 2 months ago

The error comes from: https://github.com/eu-nebulous/monitoring/blob/c608066d2b2172541a557613a31e2a7ab9ba4040/nebulous/ems-nebulous/src/main/resources/helm/epm-deploy.yml#L39

robert-sanfeliu commented 2 months ago

The issue also exists on the nebulous-cd environment.

yoctozepto commented 1 month ago

Hmm... Any Helm stuff should only ever be handled from sal-scripts (this repo). The actual installation of the EMS server should be happening in:

https://github.com/eu-nebulous/sal-scripts/blob/4f0586ad27fc3aec89decf985316e90acee04dec/installation-scripts-onm/MASTER_START_SCRIPT.sh#L36-L43

yoctozepto commented 1 month ago

Based on the Slack discussion with @ipatini, the script you mention, @robert-sanfeliu, is not invoked. Thus, the issue lies elsewhere. Most likely, SAL has not picked up the updated scripts. If they were updated in-place and SAL was not restarted, then it still uses the old ones. You could verify this by browsing the files SAL currently sees and/or comparing the timestamps on the configmap vs pod. (Strikethrough because it turned out not to be the case; see my next comment.)
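For reference, the timestamp comparison mentioned above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes you have already obtained the ConfigMap's last-update time and the pod's start time as RFC 3339 strings (for example, by inspecting the objects' metadata and status with kubectl).

```python
from datetime import datetime


def parse_k8s_timestamp(value: str) -> datetime:
    """Parse an RFC 3339 timestamp such as '2024-07-05T09:03:02Z'."""
    # Older Python versions reject a trailing 'Z' in fromisoformat,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def pod_predates_config(configmap_updated_at: str, pod_started_at: str) -> bool:
    """Return True if the pod started before the ConfigMap was last updated.

    A True result suggests the pod may still be using the old scripts
    and would need a restart to pick up the new ones.
    """
    return parse_k8s_timestamp(pod_started_at) < parse_k8s_timestamp(configmap_updated_at)
```

If the function returns True for the SAL pod, restarting the pod would be the next thing to try; if it returns False, stale scripts are unlikely to be the cause.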

yoctozepto commented 1 month ago

@robert-sanfeliu, @ipatini has pointed out that the failure was on the chart itself and not the repo. I had not backtracked far enough after discovering your analysis was a red herring. ;-) Thus, the following commit should fix the issue: https://github.com/eu-nebulous/helm-charts/commit/e17ec5620b42bf3e144abdcf96e32270a18e81ca

Please let me know if it is fixed and act on this issue report appropriately.

robert-sanfeliu commented 1 month ago

@yoctozepto my bad, I searched for jmarchel7bulls in the repo and only one entry showed up for the whole NebulOuS organisation.

yoctozepto commented 1 month ago

> @yoctozepto my bad, I searched for jmarchel7bulls in the repo and only one entry showed up for the whole NebulOuS organisation.

That should not have been the case because that subdomain was clearly there before my fix.

Anyhow, please confirm if this helped.
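As a cross-check that does not depend on the web search, a stale hostname can also be looked up in local clones. The following is a minimal sketch, assuming the relevant repositories are checked out under one directory; `find_references` and the directory layout are hypothetical and not part of any NebulOuS tooling.

```python
import os


def find_references(root: str, needle: str) -> list[tuple[str, int, str]]:
    """Return (path, line number, line) for every line under root containing needle."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git metadata so only working-tree files are reported.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            try:
                # errors="ignore" lets binary files be scanned without raising.
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for lineno, line in enumerate(handle, start=1):
                        if needle in line:
                            hits.append((path, lineno, line.strip()))
            except OSError:
                # Unreadable files (permissions, broken symlinks) are skipped.
                continue
    return hits
```

Calling `find_references("path/to/clones", "jmarchel7bulls")` would list every file and line that still mentions the old subdomain.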

robert-sanfeliu commented 1 month ago

@yoctozepto The issue is solved on the nebulous-cd environment. Should any changes be made to the nebulous-test environment to apply the fix? I haven't tested it on that environment yet.

yoctozepto commented 1 month ago

@robert-sanfeliu Thanks for confirming. The only difference is that the -test env uses non-ONM scripts.

robert-sanfeliu commented 1 month ago

On July 5th I asked @jmarchel7bulls to deploy the latest version of everything (except CFSB) on nebulous-test and to use the ONM version of the SAL scripts. I think he did, but we can ask him when he is back. The wiki was not updated because I couldn't confirm that the environment was working (and that is why I opened this bug).

yoctozepto commented 1 month ago

@robert-sanfeliu You are right, the code is ONM-enabled but both k8s metadata and wiki claim otherwise. I will look into cleaning it up.