microsoft / AzureTRE

An accelerator to help organizations build Trusted Research Environments on Azure.
https://microsoft.github.io/AzureTRE
MIT License

Gitea container crashes and restarts #4024

Open jonnyry opened 4 months ago

jonnyry commented 4 months ago

Using Gitea workspace service 1.0.2 on Azure TRE 0.18.0:

(crashes with similar errors also seen with the Gitea shared service)

The app is functional, though it crashes, goes offline, and then restarts. The crash usually happens at first startup, but can then recur intermittently with no predictable trigger.

I suspect it might be related to the Azure Files mount, since it is an SMB mount rather than a local disk, but I don't have anything directly pointing to that.

image

tim-allen-ck commented 4 months ago

Any chance you can get that log file from the container?

jonnyry commented 3 months ago

Yes, please see attached:

gitea-shared-service-log-19-07-2024.txt

jonnyry commented 3 months ago

I've switched the file storage to an NFS mount (Storage Account Premium tier) rather than the default SMB mount, and am seeing whether that makes a difference.

A few links that hint at the problem:

From Mount Azure Storage as a local share in App Service:

It isn't recommended to use storage mounts for local databases (such as SQLite) or for any other applications and components that rely on file handles and locks.

And from Gitea docs, which indicate locks are being used:

In most cases, it's caused by broken NFS lock system
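
One way to check whether locking is the culprit would be to probe the mounted directory directly. The snippet below is only a rough sketch, not something taken from Gitea or Azure TRE: it assumes a Linux container with Python available, it assumes that `flock`-style advisory locks are the kind that matter here (not confirmed), and `lock_is_enforced` and the `.lock-probe` file name are made up for illustration. It takes an exclusive lock in one process and checks that a second process is refused the same lock.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Child script: try a non-blocking exclusive lock on the given file.
# Exits 0 if the lock was acquired, 3 if it was refused.
_CHILD = (
    "import fcntl, sys\n"
    "f = open(sys.argv[1], 'a')\n"
    "try:\n"
    "    fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)\n"
    "except OSError:\n"
    "    sys.exit(3)\n"
    "sys.exit(0)\n"
)


def lock_is_enforced(directory: str) -> bool:
    """Hypothetical helper: True if a lock held here blocks a second process."""
    path = os.path.join(directory, ".lock-probe")  # illustrative file name
    try:
        with open(path, "a") as holder:
            try:
                fcntl.flock(holder, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError:
                # The filesystem rejected the lock request outright.
                return False
            # While this process holds the lock, a second process should be refused.
            child = subprocess.run([sys.executable, "-c", _CHILD, path])
            return child.returncode == 3
    finally:
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    # Pass the mount point as the first argument; defaults to the temp directory.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    print(target, "lock enforced:", lock_is_enforced(target))
```

Running it against both a local directory and the mounted share would show whether the two behave differently. A `False` on the share would only be a hint, not proof that this is what crashes Gitea.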

tim-allen-ck commented 2 months ago

@jonnyry any update on this? Did you manage to get anywhere?

jonnyry commented 1 month ago

@tim-allen-ck I did not unfortunately... I ran out of time.

When I come back to this, I'm thinking of running Gitea on a VM instead, as it will hopefully resolve the lock and stability issues and provide better configurability over port mapping and logging. I was also struggling with the restrictions the Azure Web App Service places on ports, which meant I was not able to SSH into the container.