ome / omero-documentation

Sphinx-based documentation for the Open Microscopy Environment
https://omero.readthedocs.io
BSD 2-Clause "Simplified" License

Weird issue with Omero #2219

Open Hamedhm opened 2 years ago

Hamedhm commented 2 years ago

Hi,

We have an OMERO server, version 5.3.1 (I know it is old!), which we use to upload 100K+ images, and we have recently encountered some issues with it.

Before I start, the config file looks like below:

./omero config get
Ice.IPv6=0
Ice.MessageSizeMax=500000
omero=True
omero.data.dir=/XXXXX/omero_data_dev
omero.db.host=XXXX
omero.db.name=XXXXX
omero.db.pass=XXXXX
omero.db.user=XXXXX
omero.debug=False
omero.jvmcfg.max_system_memory=31000
omero.jvmcfg.percent.blitz=50
omero.jvmcfg.percent.indexer=20
omero.jvmcfg.percent.pixeldata=20
omero.jvmcfg.system_memory=30000
omero.web.application_host=XXXXX/media/omero:80/
omero.web.application_server=fastcgi-tcp
omero.web.debug=True
omero.web.public.enabled=True
omero.web.public.password=XXXXX
omero.web.public.url_filter=webgateway|webclient/annotation
omero.web.public.user=XXXXX
omero.web.static_url=/mi/media/static/

The issue is that the server quickly becomes unavailable. The log below was produced by the process and may contain some clues about the cause:

2022-03-08 15:18:59,806 - OmeroService - INFO:dataset=YYYYYYY
!! 03/08/22 15:18:59.946 error: cannot create thread for timer:
   Thread.cpp:752: IceUtil::ThreadSyscallException:
   syscall exception: Resource temporarily unavailable
InternalException: Failed to connect: exception ::Ice::UnknownException
{
    unknown = Thread.cpp:752: IceUtil::ThreadSyscallException:
syscall exception: Resource temporarily unavailable
}
2022-03-08 15:18:59,946 - OmeroService - ERROR:assert failed
2022-03-08 15:18:59,963 - omero.gateway - WARNING:ConnectionLostException on <class 'omero.gateway.OmeroGatewaySafeCallWrapper'> to <3fc15a15-0fda-4ded-afad-485dabcbf5c6omero.api.IQuery> 
!! 03/08/22 15:19:00.125 error: cannot create thread for `Ice.ThreadPool.Client':
   Thread.cpp:752: IceUtil::ThreadSyscallException:
   syscall exception: Resource temporarily unavailable

Have you seen the same issue before?

will-moore commented 2 years ago

Hi, can you give some more details on the sizes (X, Y, Z, C, T) and formats of the images? And how are you importing them? Command line, Insight client?

Hamedhm commented 2 years ago

Thanks @will-moore. Please see the chart below (image: ImagesPipelineExtensions_hhm).

joshmoore commented 2 years ago

Hi @Hamedhm. How long has this been happening? Did anything change at about the time it began? (e.g. a move to a new server, a configuration change, a new file type)

The error you're getting looks to me like a system resource issue. Without knowing more, I'd guess a file descriptor limit (ulimit -n).
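For reference, the relevant limits can be read from Python as well as from the shell. This is only a sketch using the standard-library resource module (Unix only); it is not part of OMERO, and the helper names are made up for illustration. Besides the open-file limit it also prints the per-user process/thread limit (ulimit -u), because on Linux a failed thread creation with "Resource temporarily unavailable" can also come from that limit or from memory pressure. It should be run as the same user that runs the OMERO server, since limits are per user and per session.

```python
import resource

# Limits that are plausible suspects for "cannot create thread" errors.
# RLIMIT_NPROC is not defined on every platform, hence the getattr guard.
LIMITS = {
    "open files (ulimit -n)": "RLIMIT_NOFILE",
    "user processes/threads (ulimit -u)": "RLIMIT_NPROC",
}


def fmt(value):
    """Render a limit value the way the shell does."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def read_limits():
    """Return {label: (soft, hard)} for the limits available on this platform."""
    found = {}
    for label, name in LIMITS.items():
        const = getattr(resource, name, None)
        if const is not None:
            soft, hard = resource.getrlimit(const)
            found[label] = (fmt(soft), fmt(hard))
    return found


if __name__ == "__main__":
    for label, (soft, hard) in sorted(read_limits().items()):
        print("{}: soft={} hard={}".format(label, soft, hard))
```

The soft value is the one that is actually enforced; the hard value is the ceiling to which the soft limit can be raised without extra privileges.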

Hamedhm commented 2 years ago

Hi @joshmoore

Thank you for your reply. No, nothing changed as far as I know.

I guess the issue is that OMERO eats up all the memory. For example, I restarted OMERO a few hours ago and it is already using 17.6GB of virtual memory. Yesterday, it was around 80GB.

image

For the command ulimit -n I get 65000; I'm not sure what this means.

joshmoore commented 2 years ago

How much memory does this system have, @Hamedhm?

> For the command ulimit -n I get 65000; I'm not sure what this means.

This is a reasonable setting, so :+1:

> For example, I restarted OMERO a few hours ago and it is already using 17.6GB of virtual memory. Yesterday, it was around 80GB.

Interesting. Can you show the rest of that output? What Python process is running under the Java process with 1500m of memory?
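If a screenshot is awkward, the same information can be pulled from /proc as text. A rough sketch, assuming a Linux host; parse_status and list_processes are hypothetical helper names, not OMERO functions. It reports each process's parent PID, thread count and virtual size, which would show both which process started the Python script and whether any process has an unusually high thread count.

```python
import os


def parse_status(text):
    """Parse the 'Key:<tab>value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def list_processes(proc_root="/proc"):
    """Yield (pid, ppid, name, threads, vmsize) for every readable process."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as handle:
                status = parse_status(handle.read())
        except (IOError, OSError):
            # The process exited or is not readable by this user.
            continue
        yield (
            int(entry),
            int(status.get("PPid", 0)),
            status.get("Name", "?"),
            int(status.get("Threads", 0)),
            status.get("VmSize", "n/a"),  # absent for kernel threads
        )


if __name__ == "__main__":
    # Sort by thread count, highest first, and show the top ten.
    rows = sorted(list_processes(), key=lambda row: row[3], reverse=True)
    for pid, ppid, name, threads, vmsize in rows[:10]:
        print("{:>7} {:>7} {:<20} threads={:<6} vmsize={}".format(
            pid, ppid, name, threads, vmsize))
```

Looking up the PPid of the Python script in the same listing gives its parent process.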

Hamedhm commented 2 years ago

Thanks @joshmoore

The machine has 32GB of memory.

The screenshots below show the machine when it is idle (for example, just after a restart). There is one task running on the machine, but really nothing much.

image

and

image

joshmoore commented 2 years ago

Hi @Hamedhm,

Historically, the Java virtual memory consumption can be taken with a grain of salt, but 80GB definitely sounds like an issue. Also, noting the Java version (1.6), I must make a plea for an upgrade.
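To put the jvmcfg settings from the first post in context, here is a back-of-the-envelope calculation. It assumes that each service's maximum heap is simply its configured percentage of omero.jvmcfg.system_memory (capped at max_system_memory); that is an assumption about how the settings are applied and has not been checked against 5.3.1. Only the three services with an explicit percent setting are included.

```python
# Values copied from the `omero config get` output in the first post (MB).
system_memory = 30000
max_system_memory = 31000
percent = {"blitz": 50, "indexer": 20, "pixeldata": 20}

# Assumption: usable memory is system_memory capped at max_system_memory,
# and each service gets its percentage of that as its maximum heap.
usable = min(system_memory, max_system_memory)
heap_mb = {name: usable * pct // 100 for name, pct in percent.items()}
total_mb = sum(heap_mb.values())

for name in sorted(heap_mb):
    print("{:<10} {:>6} MB".format(name, heap_mb[name]))
print("{:<10} {:>6} MB of {} MB".format("total", total_mb, usable))
```

Under that assumption the three heaps could grow to 27000 MB in total on a 32GB machine, before counting non-heap JVM memory, the web server and any Python scripts, which would leave little headroom.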

As a side note: I don't recognize this get_omero_ids_from_csv_file.py script. Do you know what is running it?