will-moore opened 9 months ago
As suggested, I tried instead to move the data first and delete it later. Left this running in a screen... but it seems this was actually copying the data from one mounted location to another?
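That is the expected behaviour when the destination is on a different filesystem: mv can only rename in place within one filesystem, and otherwise falls back to copy-then-delete. A pre-check along these lines (a sketch, not from the original session) would have flagged it:
# Compare device numbers of source and destination (GNU stat); a mismatch means mv will copy
src=/data/BioFormatsCache/data/OMERO/ManagedRepository/demo_2
dst=/BFCacheToDelete2
if [ "$(stat -c %d "$src")" = "$(stat -c %d "$dst")" ]; then echo "same filesystem: mv = rename"; else echo "different filesystems: mv = copy + delete"; fi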
[wmoore@test120-omeroreadwrite ~]$ sudo mkdir /BFCacheToDelete2
[wmoore@test120-omeroreadwrite ~]$ sudo mv /data/BioFormatsCache/data/OMERO/ManagedRepository/demo_2 /BFCacheToDelete2
...
mv: cannot create directory ‘/BFCacheToDelete2/demo_2/Blitz-0-Ice.ThreadPool.Server-165’: No space left on device
mv: cannot create directory ‘/BFCacheToDelete2/demo_2/Blitz-0-Ice.ThreadPool.Server-199’: No space left on device
mv: cannot create directory ‘/BFCacheToDelete2/demo_2/Blitz-0-Ice.ThreadPool.Server-179’: No space left on device
mv: cannot create directory ‘/BFCacheToDelete2/demo_2/Blitz-0-Ice.ThreadPool.Server-202’: No space left on device
mv: cannot create directory ‘/BFCacheToDelete2/demo_2/Blitz-0-Ice.ThreadPool.Server-180’: No space left on device
mv: cannot create directory ‘/BFCacheToDelete2/demo_2/Blitz-0-Ice.ThreadPool.Server-200’: No space left on device
mv: cannot create directory ‘/BFCacheToDelete2/demo_2/Blitz-0-Ice.ThreadPool.Server-206’: No space left on device
Restarted $ sudo rm -rf /data/BioFormatsCache/* in a screen...
Meanwhile, cleaning up: $ sudo rm -rf /BFCacheToDelete2
It seems that parallel wasn't installed on idr-testing...
[wmoore@test120-proxy ~]$ which parallel
/usr/bin/which: no parallel in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/wmoore/.local/bin:/home/wmoore/bin)
[wmoore@test120-proxy ~]$ sudo yum install https://dl.fedoraproject.org/pub/epel/7/x86_64/Packages/p/parallel-20160222-1.el7.noarch.rpm
...
[wmoore@test120-proxy ~]$ which parallel
/usr/local/bin/parallel
Try again...
screen -dmS cache parallel --eta --sshloginfile nodes -a ids.txt --results /tmp/cache/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
screen -r
Computers / CPU threads / Max jobs to run
1:omeroreadonly-1 / 1 / 10
2:omeroreadonly-2 / 1 / 10
3:omeroreadonly-3 / 1 / 10
4:omeroreadonly-4 / 1 / 10
5:omeroreadwrite / 1 / 10
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
ETA: 21698s Left: 21691 AVG: 2.88s omeroreadonly-1:10/2/20%/29.0s omeroreadonly-2:10/5/25%/11.6s omeroreadonly-3:10/0/17%/0.0s omeroreadonly-4:10/1/18%/58.0s omeroreadwrite:10/0/17%/0.0s
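For reference, what each flag in that parallel invocation does (annotated, same command as above):
# --eta                 print the progress/ETA lines shown above
# --sshloginfile nodes  run jobs over ssh on the servers listed in the file 'nodes'
# -a ids.txt            read arguments from ids.txt; each line (e.g. Image:608472) is appended to the command
# --results /tmp/cache/ store each job's stdout/stderr under /tmp/cache/<input-source>/<argument>/
# -j10                  run up to 10 jobs concurrently on each server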
Seems pretty unhappy... Lots of DatabaseBusyException
...
[wmoore@test120-proxy ~]$ grep Error /tmp/cache/**/**/* | wc
91 712 14456
[wmoore@test120-proxy ~]$ grep Error /tmp/cache/**/**/*
...
/tmp/cache/1/Image:608472/stderr:FileNotFoundError: [Errno 2] No such file or directory: path('/home/wmoore/omero/sessions/localhost/public/5e2e34a1-4b10-49e8-a07d-132d164df892')
/tmp/cache/1/Image:608856/stderr:FileNotFoundError: [Errno 2] No such file or directory: path('/home/wmoore/omero/sessions/localhost/public/5e2e34a1-4b10-49e8-a07d-132d164df892')
/tmp/cache/1/Image:610008/stderr:FileNotFoundError: [Errno 2] No such file or directory: path('/home/wmoore/omero/sessions/localhost/public/5e2e34a1-4b10-49e8-a07d-132d164df892')
/tmp/cache/1/Image:63089/stderr:FileExistsError: [Errno 17] File exists: path('/home/wmoore/omero/tmp')
/tmp/cache/1/Image:63089/stderr:FileExistsError: [Errno 17] File exists: path('/tmp/omero/tmp')
/tmp/cache/1/Image:63089/stderr:Error loading: /opt/omero/server/venv3/lib64/python3.6/site-packages/omero/plugins/_metadata_deprecated.py
/tmp/cache/1/Image:667768/stderr:FileNotFoundError: [Errno 2] No such file or directory: path('/home/wmoore/omero/sessions/localhost/public/05ee3d02-7fd0-4ec7-ad69-70692dc7bb7e')
/tmp/cache/1/Image:68465/stderr:FileNotFoundError: [Errno 2] No such file or directory: path('/home/wmoore/omero/sessions/localhost/public/577d172f-af4a-43b6-a5af-2f2ce42beef1')
/tmp/cache/1/Image:69233/stderr:FileNotFoundError: [Errno 2] No such file or directory: path('/home/wmoore/omero/sessions/localhost/public/577d172f-af4a-43b6-a5af-2f2ce42beef1')
/tmp/cache/1/Image:693144/stderr:FileNotFoundError: [Errno 2] No such file or directory: path('/home/wmoore/omero/sessions/localhost/public/97488570-6c46-4eec-8874-12ee239091ff')
...
[wmoore@test120-proxy ~]$ cat /tmp/cache/1/Image:98291/stderrr
cat: /tmp/cache/1/Image:98291/stderrr: No such file or directory
[wmoore@test120-proxy ~]$ cat /tmp/cache/1/Image:98291/stderr
Traceback (most recent call last):
File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero/plugins/sessions.py", line 622, in attach
rv = store.attach(server, name, uuid, set_current=set_current)
File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero/util/sessions.py", line 349, in attach
set_current=set_current)
File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero/util/sessions.py", line 385, in create
ec = sf.getAdminService().getEventContext()
File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero_api_IAdmin_ice.py", line 2655, in getEventContext
return _M_omero.api.IAdmin._op_getEventContext.invoke(self, ((), _ctx))
omero.DatabaseBusyException: exception ::omero::DatabaseBusyException
{
serverStackTrace = ome.conditions.DatabaseBusyException: cannot create transaction
[wmoore@test120-proxy ~]$ grep DatabaseBusyException /tmp/cache/**/**/* | wc
907 10422 135925
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
ETA: 46524s Left: 20435 AVG: 2.30s omeroreadonly-1:10/284/22%/10.4s omeroreadonly-2:10/146/11%/20.2s omeroreadonly-3:10/63/5%/46.7s omeroreadonly-4:10/141/11%/20.9s omeroreadwrite:10/630/48%/4.7s
Cancelled:
[wmoore@test120-proxy ~]$ screen -X -S 2497.cache quit
Restarting...
for server in omeroreadwrite omeroreadonly-1 omeroreadonly-2 omeroreadonly-3 omeroreadonly-4; do ssh $server "sudo service omero-server restart"; done
Re-running...
[wmoore@test120-proxy ~]$ rm -rf /tmp/cache/
[wmoore@test120-proxy ~]$ screen -dmS cache parallel --eta --sshloginfile nodes -a ids.txt --results /tmp/cache/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force
'
[wmoore@test120-proxy ~]$ screen -r 3951.cache
...
Computers / CPU threads / Max jobs to run
1:omeroreadonly-1 / 1 / 10
2:omeroreadonly-2 / 1 / 10
3:omeroreadonly-3 / 1 / 10
4:omeroreadonly-4 / 1 / 10
5:omeroreadwrite / 1 / 10
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
ETA: 13701s Left: 16084 AVG: 0.88s omeroreadonly-1:10/458/8%/10.8s omeroreadonly-2:10/401/7%/12.4s omeroreadonly-3:10/322/5%/15.4s omeroreadonly-4:10/440/7%/11.3s omeroreadwrite:10/3994/70%/1.2s
[wmoore@test120-proxy ~]$ cat /tmp/cache/1/Image:10842059/stderr
usage: /opt/omero/server/venv3/bin/omero render test [-h] [--force] [--thumb]
object
/opt/omero/server/venv3/bin/omero render test: error: too few arguments
/tmp/par-job-5089_BCnlw: line 3: Image:10842059: command not found
Oops - I accidentally added a newline in the command above. Cancel and start again...
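A quoting mistake like this can be caught before launching with parallel's --dry-run flag, which prints the composed commands instead of running them (a sketch):
head -3 ids.txt | parallel --dry-run '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'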
[wmoore@test120-proxy ~]$ rm -rf /tmp/cache/
[wmoore@test120-proxy ~]$ screen -dmS cache parallel --eta --sshloginfile nodes -a ids.txt --results /tmp/cache/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
Looks like it's running OK now...
[wmoore@test120-proxy ~]$ grep ok /tmp/cache/**/**/*
...
/tmp/cache/1/Image:62321/stdout:ok: 62321 70.78979206085205
/tmp/cache/1/Image:63089/stdout:ok: 63089 202.19777274131775
/tmp/cache/1/Image:63857/stdout:ok: 63857 2.0913946628570557
/tmp/cache/1/Image:64625/stdout:ok: 64625 1.5413122177124023
/tmp/cache/1/Image:65393/stdout:ok: 65393 1.3958213329315186
[wmoore@test120-proxy ~]$ grep ok /tmp/cache/**/**/* | wc
50 150 3077
[wmoore@test120-proxy ~]$ grep Error /tmp/cache/**/**/*
/tmp/cache/1/Image:38517/stdout:fail: 38517 14.241586923599243 exception ::omero::ResourceError
/tmp/cache/1/Image:40053/stdout:fail: 40053 13.972737312316895 exception ::omero::ResourceError
As discussed in the IDR meeting just now, decided to stop memo generation and test with the ~2000 filesets that have been processed so far.
[wmoore@test120-proxy ~]$ screen -X -S 14861.cache quit
Exception on omeroreadonly-4
2024-02-15 09:53:44,486 INFO [ org.perf4j.TimingLogger] (.Server-31) start[1707990819611] time[4874] tag[omero.call.success.ome.logic.QueryImpl.findAllByQuery]
2024-02-15 09:53:44,486 INFO [ ome.services.util.ServiceHandler] (.Server-31) Rslt: (ome.model.core.Image:Id_10496304)
2024-02-15 09:53:44,486 INFO [ ome.security.basic.BasicEventContext] (.Server-26) cctx: group=-1
2024-02-15 09:53:44,492 INFO [ ome.services.util.ServiceHandler] (.Server-38) Meth: interface ome.api.IQuery.findAllByQuery
2024-02-15 09:53:44,492 INFO [ ome.services.util.ServiceHandler] (.Server-38) Args: [select i from Image as i left outer join i.wellSamples as ws join ws.well as well join well.plate as, PARAMS:F(o0l1):tag_text=Study Example Image id
=2501 ]
2024-02-15 09:53:44,495 INFO [ ome.security.basic.EventHandler] (.Server-26) Auth: user=52,group=-1,event=null(User),sess=b3bff02f-85c2-4591-bbf0-5b73d2f11367
2024-02-15 09:53:44,647 INFO [ org.perf4j.TimingLogger] (.Server-36) start[1707990819845] time[4801] tag[omero.call.exception]
2024-02-15 09:53:44,647 WARN [ ome.services.util.ServiceHandler] (.Server-36) Unknown exception thrown.
org.springframework.dao.DataAccessResourceFailureException: Hibernate operation: could not execute query; SQL [select image0_.id as id60_, image0_.acquisitionDate as acquisit2_60_, image0_.archived as archived60_, image0_.description as descript
4_60_, image0_.creation_id as creation10_60_, image0_.external_id as external11_60_, image0_.group_id as group12_60_, image0_.owner_id as owner13_60_, image0_.permissions as permissi5_60_, image0_.update_id as update14_60_, image0_.experiment as
experiment60_, image0_.fileset as fileset60_, image0_.format as format60_, image0_.imagingEnvironment as imaging18_60_, image0_.instrument as instrument60_, image0_.name as name60_, image0_.objectiveSettings as objecti20_60_, image0_.partial as
partial60_, image0_.series as series60_, image0_.stageLabel as stageLabel60_, image0_.version as version60_ from image image0_ left outer join wellsample wellsample1_ on image0_.id=wellsample1_.image and
(
1 = ? OR
1 = ? OR
(wellsample1_.group_id in (?)) OR
(wellsample1_.owner_id = ? AND (select (__g.permissions & 1024) = 1024 from experimentergroup __g where __g.id = wellsample1_.group_id)) OR
(wellsample1_.group_id in (?,?) AND (select (__g.permissions & 64) = 64 from experimentergroup __g where __g.id = wellsample1_.group_id)) OR
((select (__g.permissions & 4) = 4 from experimentergroup __g where __g.id = wellsample1_.group_id))
) inner join well well2_ on wellsample1_.well=well2_.id inner join plate plate3_ on well2_.plate=plate3_.id left outer join screenplatelink screenlink4_ on plate3_.id=screenlink4_.child and
(
1 = ? OR
1 = ? OR
(screenlink4_.group_id in (?)) OR
(screenlink4_.owner_id = ? AND (select (__g.permissions & 1024) = 1024 from experimentergroup __g where __g.id = screenlink4_.group_id)) OR
(screenlink4_.group_id in (?,?) AND (select (__g.permissions & 64) = 64 from experimentergroup __g where __g.id = screenlink4_.group_id)) OR
((select (__g.permissions & 4) = 4 from experimentergroup __g where __g.id = screenlink4_.group_id))
) inner join screen screen5_ on screenlink4_.parent=screen5_.id left outer join imageannotationlink annotation6_ on image0_.id=annotation6_.parent and
(
1 = ? OR
1 = ? OR
(annotation6_.group_id in (?)) OR
(annotation6_.owner_id = ? AND (select (__g.permissions & 1024) = 1024 from experimentergroup __g where __g.id = annotation6_.group_id)) OR
(annotation6_.group_id in (?,?) AND (select (__g.permissions & 64) = 64 from experimentergroup __g where __g.id = annotation6_.group_id)) OR
((select (__g.permissions & 4) = 4 from experimentergroup __g where __g.id = annotation6_.group_id))
) inner join annotation annotation7_ on annotation6_.child=annotation7_.id where
(
1 = ? OR
1 = ? OR
(image0_.group_id in (?)) OR
(image0_.owner_id = ? AND (select (__g.permissions & 1024) = 1024 from experimentergroup __g where __g.id = image0_.group_id)) OR
(image0_.group_id in (?,?) AND (select (__g.permissions & 64) = 64 from experimentergroup __g where __g.id = image0_.group_id)) OR
:
((select (__g.permissions & 4) = 4 from experimentergroup __g where __g.id = image0_.group_id))
) and screen5_.id=? and annotation7_.textValue=? order by well2_."column", well2_."row"]; ERROR: remaining connection slots are reserved for non-replication superuser connections
Where: parallel worker; nested exception is org.postgresql.util.PSQLException: ERROR: remaining connection slots are reserved for non-replication superuser connections
Where: parallel worker
at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:251)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73)
at org.springframework.orm.hibernate3.HibernateAccessor.convertJdbcAccessException(HibernateAccessor.java:428)
at org.springframework.orm.hibernate3.HibernateAccessor.convertHibernateAccessException(HibernateAccessor.java:414)
at org.springframework.orm.hibernate3.HibernateTemplate.doExecute(HibernateTemplate.java:416)
at org.springframework.orm.hibernate3.HibernateTemplate.execute(HibernateTemplate.java:342)
at ome.logic.QueryImpl.execute(QueryImpl.java:176)
at ome.logic.QueryImpl.findAllByQuery(QueryImpl.java:419)
at jdk.internal.reflect.GeneratedMethodAccessor597.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:333)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at ome.security.basic.EventHandler.invoke(EventHandler.java:154)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.orm.hibernate3.HibernateInterceptor.invoke(HibernateInterceptor.java:119)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:99)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:282)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:96)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at ome.tools.hibernate.ProxyCleanupFilter$Interceptor.invoke(ProxyCleanupFilter.java:249)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at ome.services.util.ServiceHandler.invoke(ServiceHandler.java:121)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
at com.sun.proxy.$Proxy111.findAllByQuery(Unknown Source)
at jdk.internal.reflect.GeneratedMethodAccessor597.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:333)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at ome.security.basic.BasicSecurityWiring.invoke(BasicSecurityWiring.java:93)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at ome.services.blitz.fire.AopContextInitializer.invoke(AopContextInitializer.java:43)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
at com.sun.proxy.$Proxy111.findAllByQuery(Unknown Source)
at jdk.internal.reflect.GeneratedMethodAccessor729.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at ome.services.blitz.util.IceMethodInvoker.invoke(IceMethodInvoker.java:172)
:
at ome.services.throttling.Callback.run(Callback.java:56)
at ome.services.throttling.InThreadThrottlingStrategy.callInvokerOnRawArgs(InThreadThrottlingStrategy.java:56)
at ome.services.blitz.impl.AbstractAmdServant.callInvokerOnRawArgs(AbstractAmdServant.java:140)
at ome.services.blitz.impl.QueryI.findAllByQuery_async(QueryI.java:64)
at jdk.internal.reflect.GeneratedMethodAccessor728.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:333)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at omero.cmd.CallContext.invoke(CallContext.java:85)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
at com.sun.proxy.$Proxy112.findAllByQuery_async(Unknown Source)
at omero.api._IQueryTie.findAllByQuery_async(_IQueryTie.java:90)
at omero.api._IQueryDisp.___findAllByQuery(_IQueryDisp.java:594)
at omero.api._IQueryDisp.__dispatch(_IQueryDisp.java:733)
at IceInternal.Incoming.invoke(Incoming.java:221)
at Ice.ConnectionI.invokeAll(ConnectionI.java:2536)
at Ice.ConnectionI.dispatch(ConnectionI.java:1145)
at Ice.ConnectionI.message(ConnectionI.java:1056)
at IceInternal.ThreadPool.run(ThreadPool.java:395)
at IceInternal.ThreadPool.access$300(ThreadPool.java:12)
at IceInternal.ThreadPool$EventHandlerThread.run(ThreadPool.java:832)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.postgresql.util.PSQLException: ERROR: remaining connection slots are reserved for non-replication superuser connections
Where: parallel worker
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2422)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2167)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:306)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365)
at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:155)
at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:118)
at bitronix.tm.resource.jdbc.proxy.PreparedStatementJavaProxy.executeQuery(PreparedStatementJavaProxy.java:102)
at org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:208)
at org.hibernate.loader.Loader.getResultSet(Loader.java:1953)
at org.hibernate.loader.Loader.doQuery(Loader.java:802)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:274)
at org.hibernate.loader.Loader.doList(Loader.java:2542)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2276)
at org.hibernate.loader.Loader.list(Loader.java:2271)
at org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:459)
at org.hibernate.hql.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:365)
at org.hibernate.engine.query.HQLQueryPlan.performList(HQLQueryPlan.java:196)
at org.hibernate.impl.SessionImpl.list(SessionImpl.java:1268)
at org.hibernate.impl.QueryImpl.list(QueryImpl.java:102)
at ome.services.query.Query.doInHibernate(Query.java:253)
:
at org.springframework.orm.hibernate3.HibernateTemplate.doExecute(HibernateTemplate.java:411)
... 64 common frames omitted
2024-02-15 09:53:44,648 INFO [ ome.services.util.ServiceHandler] (.Server-36) Excp: org.springframework.dao.DataAccessResourceFailureException: Hibernate operation: could not execute query; SQL [select image0_.id as id60_, image0_.acquisitionDate as acquisit2_60_, image0_.archived as archived60_, image0_.description as descript4_60_, image0_.creation_id as creation10_60_, image0_.external_id as external11_60_, image0_.group_id as group12_60_, image0_.owner_id as owner13_60_, image0_.permissions as permissi5_60_, image0_.update_id as update14_60_, image0_.experiment as experiment60_, image0_.fileset as fileset60_, image0_.format as format60_, image0_.imagingEnvironment as imaging18_60_, image0_.instrument as instrument60_, image0_.name as name60_, image0_.objectiveSettings as objecti20_60_, image0_.partial as partial60_, image0_.series as series60_, image0_.stageLabel as stageLabel60_, image0_.version as version60_ from image image0_ left outer join wellsample wellsample1_ on image0_.id=wellsample1_.image and
(
1 = ? OR
1 = ? OR
(wellsample1_.group_id in (?)) OR
(wellsample1_.owner_id = ? AND (select (__g.permissions & 1024) = 1024 from experimentergroup __g where __g.id = wellsample1_.group_id)) OR
(wellsample1_.group_id in (?,?) AND (select (__g.permissions & 64) = 64 from experimentergroup __g where __g.id = wellsample1_.group_id)) OR
((select (__g.permissions & 4) = 4 from experimentergroup __g where __g.id = wellsample1_.group_id))
) inner join well well2_ on wellsample1_.well=well2_.id inner join plate plate3_ on well2_.plate=plate3_.id left outer join screenplatelink screenlink4_ on plate3_.id=screenlink4_.child and
(
1 = ? OR
1 = ? OR
(screenlink4_.group_id in (?)) OR
(screenlink4_.owner_id = ? AND (select (__g.permissions & 1024) = 1024 from experimentergroup __g where __g.id = screenlink4_.group_id)) OR
(screenlink4_.group_id in (?,?) AND (select (__g.permissions & 64) = 64 from experimentergroup __g where __g.id = screenlink4_.group_id)) OR
((select (__g.permissions & 4) = 4 from experimentergroup __g where __g.id = screenlink4_.group_id))
) inner join screen screen5_ on screenlink4_.parent=screen5_.id left outer join imageannotationlink annotation6_ on image0_.id=annotation6_.parent and
(
1 = ? OR
1 = ? OR
(annotation6_.group_id in (?)) OR
(annotation6_.owner_id = ? AND (select (__g.permissions & 1024) = 1024 from experimentergroup __g where __g.id = annotation6_.group_id)) OR
(annotation6_.group_id in (?,?) AND (select (__g.permissions & 64) = 64 from experimentergroup __g where __g.id = annotation6_.group_id)) OR
((select (__g.permissions & 4) = 4 from experimentergroup __g where __g.id = annotation6_.group_id))
) inner join annotation annotation7_ on annotation6_.child=annotation7_.id where
(
1 = ? OR
1 = ? OR
(image0_.group_id in (?)) OR
(image0_.owner_id = ? AND (select (__g.permissions & 1024) = 1024 from experimentergroup __g where __g.id = image0_.group_id)) OR
(image0_.group_id in (?,?) AND (select (__g.permissions & 64) = 64 from experimentergroup __g where __g.id = image0_.group_id)) OR
((select (__g.permissions & 4) = 4 from experimentergroup __g where __g.id = image0_.group_id))
) and screen5_.id=? and annotation7_.textValue=? order by well2_."column", well2_."row"]; ERROR: remaining connection slots are reserved for non-replication superuser connections
Where: parallel worker; nested exception is org.postgresql.util.PSQLException: ERROR: remaining connection slots are reserved for non-replication superuser connections
Where: parallel worker
2024-02-15 09:53:44,648 INFO [ ome.security.basic.BasicEventContext] (.Server-33) cctx: group=-1
2024-02-15 09:53:44,657 INFO [ ome.security.basic.EventHandler] (.Server-33) Auth: user=52,group=-1,event=null(User),sess=b3bff02f-85c2-4591-bbf0-5b73d2f11367
2024-02-15 09:53:44,659 INFO [o.services.sessions.SessionContext$Count] (.Server-23) -Reference count: b3bff02f-85c2-4591-bbf0-5b73d2f11367=15
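The "remaining connection slots are reserved for non-replication superuser connections" error means PostgreSQL hit its max_connections limit: 5 servers x 10 parallel jobs, each creating sessions, on top of the servers' own connection pools. A quick way to watch connection usage on the database host (a sketch, assuming psql access as the postgres user; not part of the original log):
psql -c 'SHOW max_connections;'
psql -c 'SELECT count(*) AS connections_in_use FROM pg_stat_activity;'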
Started again...
screen -dmS cache parallel --eta --sshloginfile nodes -a ids.txt --results /tmp/cache/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
Status after about 5 mins...
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
ETA: 4107s Left: 20296 AVG: 0.20s omeroreadonly-1:10/38/3%/8.3s omeroreadonly-2:10/267/19%/1.2s omeroreadonly-3:10/280/19%/1.1s omeroreadonly-4:10/221/15%/1.4s omeroreadwrite:10/597/41%/0.5s
16 hours later... apart from computer 1, the completion percentages seem lower today than yesterday:
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
ETA: 440639s Left: 19178 AVG: 23.45s 1:10/257/10%/230.1s 2:10/471/18%/125.6s 3:10/469/18%/126.1s 4:10/482/19%/122.7s 5:10/842/33%/70.2s
Summary of current progress...
[wmoore@test120-proxy ~]$ ls -alh /tmp/cache/1/ | grep "Feb 14" | wc
1984 17856 121856
[wmoore@test120-proxy ~]$ ls -alh /tmp/cache/1/ | grep "Feb 15" | wc
487 4383 30194
[wmoore@test120-proxy ~]$ ls -alh /tmp/cache/1/ | grep "Feb 16" | wc
102 918 6312
[wmoore@test120-proxy ~]$ ls -alh /tmp/cache/1/ | wc
2574 23159 158373
[wmoore@test120-proxy ~]$ grep ok /tmp/cache/**/**/* | wc
2586 7902 170484
Stopped for more testing
[wmoore@test120-proxy ~]$ screen -X -S 10106.cache quit
Testing successful! Starting again...
screen -dmS cache parallel --eta --sshloginfile nodes -a ids.txt --results /tmp/cache/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
After 4 minutes...
screen -r
Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
ETA: 2235s Left: 19477 AVG: 0.12s omeroreadonly-1:10/505/22%/0.6s omeroreadonly-2:10/480/21%/0.6s omeroreadonly-3:10/530/23%/0.5s omeroreadonly-4:10/454/20%/0.6s omeroreadwrite:10/253/11%/1.1s
Stopped memo generation during the morning IDR meeting.
Found in testing that on idr-testing:omeroreadonly-1 the goofys mount had failed, which likely explains why several memo files failed to generate.
Re-mounted it and then restarted again...
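For future runs, a dead goofys/FUSE mount could be detected on each node before starting (a sketch; /path/to/goofys/mount is a placeholder, not the real mount point):
for server in omeroreadwrite omeroreadonly-1 omeroreadonly-2 omeroreadonly-3 omeroreadonly-4; do
  # a failed FUSE mount typically errors with "Transport endpoint is not connected"
  ssh $server "ls /path/to/goofys/mount > /dev/null || echo $server: mount check FAILED"
done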
screen -dmS cache parallel --eta --sshloginfile nodes -a ids.txt --results /tmp/cache/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
On idr0138-pilot, we want to create memos, particularly for idr0090
This lists all Plates for idr0090...
/opt/omero/server/OMERO.server/bin/omero hql --ids-only --style csv 'select link.child.id from ScreenPlateLink as link where link.parent.id=2851'
We can use that in a query to get the Images:
(venv3) (base) [wmoore@pilot-idr0138-omeroreadwrite scripts]$ /opt/omero/server/OMERO.server/bin/omero hql --limit -1 --ids-only --style csv 'select MIN(field.image.id) FROM WellSample AS field where field.well.plate.id in (select link.child.id from ScreenPlateLink as link where link.parent.id=2851) GROUP BY field.well.plate'
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
#,Col1
0,12544757
1,12550005
2,12551317
3,12546037
4,12550677
5,12553269
6,12554229
7,12547509
8,12554709
9,12541269
10,12553749
11,12545749
12,12549141
13,12552053
14,12554997
15,12539701
16,12552789
17,12543765
18,12548245
19,12542037
20,12546773
21,12543029
(venv3) (base) [wmoore@pilot-idr0138-omeroreadwrite scripts]$ /opt/omero/server/OMERO.server/bin/omero hql --limit -1 --ids-only --style csv 'select MIN(field.image.id) FROM WellSample AS field where field.well.plate.id in (select link.child.id from ScreenPlateLink as link where link.parent.id=2851) GROUP BY field.well.plate' > idr0090_plates.txt
cut -d ',' -f2 idr0090_plates.txt | sed -e 's/^/Image:/' > idr0090_ids.txt
Edited the file to remove the first line (the #,Col1 header).
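The manual edit could be avoided by dropping the header row in the pipeline instead, e.g. (an equivalent sketch):
# skip the '#,Col1' header line, then extract and prefix the ids as before
tail -n +2 idr0090_plates.txt | cut -d ',' -f2 | sed -e 's/^/Image:/' > idr0090_ids.txt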
screen -S bfcache
(base) (venv3) [wmoore@pilot-idr0138-omeroreadwrite ~]$ source /opt/omero/server/venv3/bin/activate
(venv3) (base) (venv3) [wmoore@pilot-idr0138-omeroreadwrite ~]$ for i in $(cat idr0090_ids.txt); do echo $i && omero render -s localhost -u public -w public test $i --force; done;
On idr-testing, it seemed that memo generation failed for all Plates in idr0090...
Using same idr0090_ids.txt from idr0138 above...
[wmoore@test120-proxy ~]$ for i in $(cat idr0090_ids.txt); do echo $i && grep Error /tmp/cache/1/$i/* ; done
...
Image:12548245
/tmp/cache/1/Image:12548245/stdout:fail: 12548245 0.4982106685638428 exception ::omero::ResourceError
Image:12542037
/tmp/cache/1/Image:12542037/stdout:fail: 12542037 8.83667540550232 exception ::omero::ResourceError
Image:12546773
/tmp/cache/1/Image:12546773/stdout:fail: 12546773 0.15928888320922852 exception ::omero::ResourceError
Image:12543029
/tmp/cache/1/Image:12543029/stdout:fail: 12543029 6.815727233886719 exception ::omero::ResourceError
But I don't have a good way of working out when the goofys mount failed on idr-testing omeroreadonly-1.
Decided to focus just on idr0090 memo file generation on idr-testing too... Cancelled run above, and restarted just with idr0090...
[wmoore@test120-proxy ~]$ screen -dmS cache parallel --eta --sshloginfile nodes -a idr0090_ids.txt --results /tmp/cache/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
(venv3) (base) (venv3) [wmoore@pilot-idr0138-omeroreadwrite ~]$ for i in $(cat idr0090_ids.txt); do echo $i && omero render -s localhost -u public -w public test $i --force; done;
Image:12544757
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
Image:12550005
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
ok: 12550005 5086.527655124664
Image:12551317
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
ok: 12551317 4007.054927587509
Image:12546037
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
ok: 12546037 4222.491681337357
Image:12550677
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
ok: 12550677 3997.5826559066772
Image:12553269
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
ok: 12553269 2755.3615374565125
Image:12554229
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
ok: 12554229 2705.048905849457
Image:12547509
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
ok: 12547509 4176.419909954071
Image:12554709
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
ok: 12554709 2855.7143104076385
Image:12541269
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
ok: 12541269 3.1789915561676025
Image:12553749
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
ok: 12553749 2566.5211091041565
Image:12545749
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
ok: 12545749 1536.8917303085327
Image:12549141
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
ok: 12549141 4395.9468557834625
Image:12552053
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
ok: 12552053 4067.1386001110077
Image:12554997
Using session for public@localhost:4064. Idle timeout: 10 min. Current group: Public
Seb's query for getting ONLY NGFF data for memo file regeneration, using the OME-NGFF tag...
https://openmicroscopy.slack.com/archives/C0K5WAD8A/p1699433365146299
/opt/omero/server/OMERO.server/bin/omero hql --limit -1 --ids-only --style csv 'SELECT MIN(field.image.id) FROM Screen as s JOIN s.plateLinks as plateLinks JOIN s.annotationLinks as annLinks join annLinks.child as ann join plateLinks.child as p join p.wells as w join w.wellSamples as field where ann.id=38280644 GROUP BY field.well.plate' > ngff_plates.txt
/opt/omero/server/OMERO.server/bin/omero hql --limit -1 --ids-only --style csv 'SELECT d.id FROM Project as p JOIN p.datasetLinks as datasetLinks JOIN p.annotationLinks as annLinks join annLinks.child as ann join datasetLinks.child as d where ann.id=38280644' > ngff_datasets.txt
$ cut -d ',' -f2 ngff_plates.txt | sed -e 's/^/Image:/' > ngff_ids.txt
$ cut -d ',' -f2 ngff_datasets.txt | sed -e 's/^/Dataset:/' >> ngff_ids.txt
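A quick sanity check of the combined target list before launching (a sketch):
wc -l ngff_ids.txt               # total targets
grep -c '^Image:' ngff_ids.txt   # from plates
grep -c '^Dataset:' ngff_ids.txt # from projects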
screen -dmS cache parallel --eta --sshloginfile nodes -a ngff_ids.txt --results /tmp/ngff_cache/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
[wmoore@test120-proxy ~]$ grep ok: /tmp/ngff_cache/1/**/* | wc
29 87 1991
[wmoore@test120-proxy ~]$ ls /tmp/ngff_cache/1 | wc
132 132 1802
[wmoore@test120-proxy ~]$ grep "Password check failed" /tmp/ngff_cache/1/**/* | wc
24 240 3341
Seeing a bunch of these "Password check failed" errors, e.g.:
[wmoore@test120-proxy ~]$ less /tmp/ngff_cache/1/Image:1484503/stderr
Current status...
[wmoore@test120-proxy ~]$ grep ok: /tmp/ngff_cache/1/**/* | wc
229 687 15984
[wmoore@test120-proxy ~]$ ls -alh /tmp/ngff_cache/1/ | wc
377 3386 22877
[wmoore@test120-proxy ~]$ ls -alh /tmp/ngff_cache/1/ | grep "Feb 20 22"
...
drwxrwxr-x. 2 wmoore wmoore 45 Feb 20 22:33 Image:1605339
drwxrwxr-x. 2 wmoore wmoore 45 Feb 20 22:33 Image:1605719
drwxrwxr-x. 2 wmoore wmoore 45 Feb 20 22:33 Image:1606099
drwxrwxr-x. 2 wmoore wmoore 45 Feb 20 22:33 Image:1606479
drwxrwxr-x. 2 wmoore wmoore 45 Feb 20 22:33 Image:1606859
drwxrwxr-x. 2 wmoore wmoore 45 Feb 20 22:33 Image:1607220
drwxrwxr-x. 2 wmoore wmoore 45 Feb 20 22:33 Image:1607600
drwxrwxr-x. 2 wmoore wmoore 45 Feb 20 22:33 Image:1607980
Cancelled this to focus on idr0090 for testing...
screen -dmS cache parallel --eta --sshloginfile nodes -a idr0090_ids.txt --results /tmp/ngff_cache/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
Left idr0090 memo generation running overnight, but it only resulted in 9 out of 22 Plates with "ok".
Ran again this morning...
screen -dmS cache parallel --eta --sshloginfile nodes -a idr0090_ids.txt --results /tmp/memo_idr0090_20240222/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
11/22 are "ok" but each of the other 11 has an exception:
[wmoore@test120-proxy ~]$ grep ok: /tmp/memo_idr0090_20240222/1/**/*
/tmp/memo_idr0090_20240222/1/Image:12539701/stdout:ok: 12539701 2.2314953804016113
/tmp/memo_idr0090_20240222/1/Image:12543029/stdout:ok: 12543029 5772.457081079483
/tmp/memo_idr0090_20240222/1/Image:12544757/stdout:ok: 12544757 1.5645971298217773
/tmp/memo_idr0090_20240222/1/Image:12545749/stdout:ok: 12545749 2015.8153846263885
/tmp/memo_idr0090_20240222/1/Image:12546773/stdout:ok: 12546773 2.3995842933654785
/tmp/memo_idr0090_20240222/1/Image:12548245/stdout:ok: 12548245 2.1758549213409424
/tmp/memo_idr0090_20240222/1/Image:12550005/stdout:ok: 12550005 5618.203018188477
/tmp/memo_idr0090_20240222/1/Image:12552789/stdout:ok: 12552789 3435.908084154129
/tmp/memo_idr0090_20240222/1/Image:12553269/stdout:ok: 12553269 2.3640246391296387
/tmp/memo_idr0090_20240222/1/Image:12553749/stdout:ok: 12553749 2.3888068199157715
/tmp/memo_idr0090_20240222/1/Image:12554229/stdout:ok: 12554229 3453.4601786136627
[wmoore@test120-proxy ~]$ grep dException: /tmp/memo_idr0090_20240222/1/**/*
/tmp/memo_idr0090_20240222/1/Image:12541269/stderr:Glacier2.PermissionDeniedException: exception ::Glacier2::PermissionDeniedException
/tmp/memo_idr0090_20240222/1/Image:12542037/stderr:Glacier2.PermissionDeniedException: exception ::Glacier2::PermissionDeniedException
/tmp/memo_idr0090_20240222/1/Image:12543765/stderr: unknown = Network.cpp:2357: Ice::ConnectionRefusedException:
/tmp/memo_idr0090_20240222/1/Image:12546037/stderr:Glacier2.PermissionDeniedException: exception ::Glacier2::PermissionDeniedException
/tmp/memo_idr0090_20240222/1/Image:12547509/stderr: unknown = Network.cpp:2357: Ice::ConnectionRefusedException:
/tmp/memo_idr0090_20240222/1/Image:12549141/stderr: unknown = Network.cpp:2357: Ice::ConnectionRefusedException:
/tmp/memo_idr0090_20240222/1/Image:12550677/stderr: unknown = Network.cpp:2357: Ice::ConnectionRefusedException:
/tmp/memo_idr0090_20240222/1/Image:12551317/stderr: unknown = Network.cpp:2357: Ice::ConnectionRefusedException:
/tmp/memo_idr0090_20240222/1/Image:12552053/stderr:Glacier2.PermissionDeniedException: exception ::Glacier2::PermissionDeniedException
/tmp/memo_idr0090_20240222/1/Image:12554709/stderr:Glacier2.PermissionDeniedException: exception ::Glacier2::PermissionDeniedException
/tmp/memo_idr0090_20240222/1/Image:12554997/stderr:Glacier2.PermissionDeniedException: exception ::Glacier2::PermissionDeniedException
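The failure modes in a results tree can be tallied like this (a sketch):
# count occurrences of each exception class name across all stderr files
grep -ho '[A-Za-z0-9._]*Exception' /tmp/memo_idr0090_20240222/1/*/stderr | sort | uniq -c | sort -rn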
screen -dmS cache parallel --eta --sshloginfile nodes -a idr0090_ids.txt --results /tmp/memo_idr0090_20240222a/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
Manually viewing images in the web client to trigger memo file generation...
To test memo file generation on some data that takes less time than idr0090, let's get some IDs for idr0011-ScreenB (40 Plates):
/opt/omero/server/OMERO.server/bin/omero hql --limit -1 --ids-only --style csv 'select MIN(field.image.id) FROM WellSample AS field where field.well.plate.id in (select link.child.id from ScreenPlateLink as link where link.parent.id=1551) GROUP BY field.well.plate' > idr0011B_plates.txt
cut -d ',' -f2 idr0011B_plates.txt | sed -e 's/^/Image:/' > idr0011B_ids.txt
screen -dmS cache parallel --eta --sshloginfile nodes -a idr0011B_ids.txt --results /tmp/memo_idr0011B_20240222/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
PermissionDeniedException:
EDIT - completed... 32 / 40 ok, 8 Exceptions...
[wmoore@test120-proxy ~]$ grep ok: /tmp/memo_idr0011B_20240222/1/**/* | wc
32 96 2618
[wmoore@test120-proxy ~]$ grep dExc /tmp/memo_idr0011B_20240222/1/**/*
/tmp/memo_idr0011B_20240222/1/Image:2856051/stderr: unknown = Network.cpp:2357: Ice::ConnectionRefusedException:
/tmp/memo_idr0011B_20240222/1/Image:2856191/stderr: unknown = Network.cpp:2357: Ice::ConnectionRefusedException:
/tmp/memo_idr0011B_20240222/1/Image:2856755/stderr: unknown = Network.cpp:2357: Ice::ConnectionRefusedException:
/tmp/memo_idr0011B_20240222/1/Image:2856795/stderr: unknown = Network.cpp:2357: Ice::ConnectionRefusedException:
/tmp/memo_idr0011B_20240222/1/Image:2856967/stderr: unknown = Network.cpp:2357: Ice::ConnectionRefusedException:
/tmp/memo_idr0011B_20240222/1/Image:2857249/stderr:Glacier2.PermissionDeniedException: exception ::Glacier2::PermissionDeniedException
/tmp/memo_idr0011B_20240222/1/Image:2857670/stderr: unknown = Network.cpp:2357: Ice::ConnectionRefusedException:
/tmp/memo_idr0011B_20240222/1/Image:2857705/stderr: unknown = Network.cpp:2357: Ice::ConnectionRefusedException:
Try running the same omero render command used in parallel above, but manually on each of the servers, using an Image ID that failed above from idr0011B.
ssh into each in turn...
[wmoore@test120-omeroreadwrite ~]$ /opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test Image:2857249 --force
ok: 2857249 253.97674584388733
[wmoore@test120-omeroreadonly-1 ~]$ /opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test Image:2857249 --force
ok: 2857249 5.177480936050415
The first one took longer, then all the readonly servers were much faster. All OK, no errors.
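The same manual check could be scripted across all five servers, reusing the server list from the restart loop above (a sketch):
for server in omeroreadwrite omeroreadonly-1 omeroreadonly-2 omeroreadonly-3 omeroreadonly-4; do
  ssh $server "/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test Image:2857249 --force"
done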
Try running again to see if the same images get errors...
screen -dmS cache parallel --eta --sshloginfile nodes -a idr0011B_ids.txt --results /tmp/memo_idr0011B_20240222a/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
This time all worked!
[wmoore@test120-proxy ~]$ grep ok: /tmp/memo_idr0011B_20240222a/1/**/* | wc
40 120 3325
So, no clues as to why we got exceptions the first time, but it shows that re-running can help.
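Since re-running helps, parallel's built-in retry support could make that automatic: --retries N re-runs a failed job, on another server where possible, up to N times (a sketch; the results path here is hypothetical):
screen -dmS cache parallel --eta --retries 3 --sshloginfile nodes -a idr0011B_ids.txt --results /tmp/memo_idr0011B_retry/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'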
Let's get on and run memo generation on idr0010 (148 Plates)
/opt/omero/server/OMERO.server/bin/omero hql --limit -1 --ids-only --style csv 'select MIN(field.image.id) FROM WellSample AS field where field.well.plate.id in (select link.child.id from ScreenPlateLink as link where link.parent.id=1351) GROUP BY field.well.plate' > idr0010_plates.txt
cut -d ',' -f2 idr0010_plates.txt | sed -e 's/^/Image:/' > idr0010_ids.txt
screen -dmS cache parallel --eta --sshloginfile nodes -a idr0010_ids.txt --results /tmp/memo_idr0010_20240222/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
[wmoore@test120-proxy ~]$ grep ok: /tmp/memo_idr0010_20240222/1/**/* | wc
123 369 9981
[wmoore@test120-proxy ~]$ grep Error /tmp/memo_idr0010_20240222/1/**/* | wc
25 225 4900
All 25 failures are the same, e.g.:
Re-ran for idr0010...
screen -dmS cache parallel --eta --sshloginfile nodes -a idr0010_ids.txt --results /tmp/memo_idr0010_20240222a/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
This time we see 2 different Exceptions:
[wmoore@test120-proxy ~]$ grep Exception: /tmp/memo_idr0010_20240222a/1/**/*
/tmp/memo_idr0010_20240222a/1/Image:1922019/stderr:Glacier2.PermissionDeniedException: exception ::Glacier2::PermissionDeniedException
/tmp/memo_idr0010_20240222a/1/Image:1943685/stderr:omero.WrappedCreateSessionException: exception ::omero::WrappedCreateSessionException
/tmp/memo_idr0010_20240222a/1/Image:3053868/stderr:Glacier2.PermissionDeniedException: exception ::Glacier2::PermissionDeniedException
/tmp/memo_idr0010_20240222a/1/Image:3055788/stderr:omero.WrappedCreateSessionException: exception ::omero::WrappedCreateSessionException
/tmp/memo_idr0010_20240222a/1/Image:3057708/stderr:omero.WrappedCreateSessionException: exception ::omero::WrappedCreateSessionException
/tmp/memo_idr0010_20240222a/1/Image:3059628/stderr:Glacier2.PermissionDeniedException: exception ::Glacier2::PermissionDeniedException
/tmp/memo_idr0010_20240222a/1/Image:3061548/stderr:omero.WrappedCreateSessionException: exception ::omero::WrappedCreateSessionException
/tmp/memo_idr0010_20240222a/1/Image:3063468/stderr:omero.WrappedCreateSessionException: exception ::omero::WrappedCreateSessionException
/tmp/memo_idr0010_20240222a/1/Image:3065437/stderr:omero.WrappedCreateSessionException: exception ::omero::WrappedCreateSessionException
We saw the WrappedCreateSessionException in the previous run.
The PermissionDeniedException looks like:
Try idr0012...
/opt/omero/server/OMERO.server/bin/omero hql --limit -1 --ids-only --style csv 'select MIN(field.image.id) FROM WellSample AS field where field.well.plate.id in (select link.child.id from ScreenPlateLink as link where link.parent.id=1202) GROUP BY field.well.plate' > idr0012_plates.txt
cut -d ',' -f2 idr0012_plates.txt | sed -e 's/^/Image:/' > idr0012_ids.txt
screen -dmS cache parallel --eta --sshloginfile nodes -a idr0012_ids.txt --results /tmp/memo_idr0012_20240222/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
Logs show idr0012 looks complete (23:10) so I cancelled the screen (still running?).
Started idr0016...
screen -dmS cache parallel --eta --sshloginfile nodes -a idr0016_ids.txt --results /tmp/memo_idr0016_20240222/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
EDIT: >36 hours later, 07:50 on the 24th, still running... (413 plates in total):
[wmoore@test120-proxy ~]$ grep ok: /tmp/memo_idr0016_20240222/1/**/* | wc
267 801 21760
[wmoore@test120-proxy ~]$ grep Error /tmp/memo_idr0016_20240222/1/**/* | wc
5 25 575
Nearly done with idr0016:
[wmoore@test120-proxy ~]$ grep ok: /tmp/memo_idr0016_20240222/1/**/* | wc
353 1059 28778
To keep things running over the rest of the weekend, cancelled that and ran all of ngff again...
screen -dmS cache parallel --eta --sshloginfile nodes -a ngff_ids.txt --results /tmp/memo_ngff_20240225/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
Since all 5 idr-testing servers were getting constant DatabaseBusyExceptions, I cancelled memo generation. But the issue has persisted for 15 mins since then.
Going to restart all servers...
for server in omeroreadwrite omeroreadonly-1 omeroreadonly-2 omeroreadonly-3 omeroreadonly-4; do ssh $server "sudo service omero-server restart"; done
Try again with all ngff...
screen -dmS cache parallel --eta --sshloginfile nodes -a ngff_ids.txt --results /tmp/memo_ngff_20240225/ -j10 '/opt/omero/server/OMERO.server/bin/omero render -s localhost -u public -w public test --force'
/opt/omero/server/OMERO.server/bin/omero hql --limit -1 --ids-only --style csv 'select d.id from Dataset d where d.id in (select link.child.id from ProjectDatasetLink as link where link.parent.id=301)' > idr0026_datasets.txt
/opt/omero/server/OMERO.server/bin/omero hql --limit -1 --ids-only --style csv 'select MIN(field.image.id) FROM WellSample AS field where field.well.plate.id in (select link.child.id from ScreenPlateLink as link where link.parent.id=1101) GROUP BY field.well.plate' > idr0013A_plates.txt
Following https://github.com/IDR/deployment/blob/master/docs/operating-procedures.md#bio-formats-cache-regeneration with step-by-step notes...
The instructions there distribute the work across all 5 proxy servers.
Followed the instructions exactly as written on idr-testing until: we can probably skip this step if we use the one shared python environment at /opt/omero/server/venv3 for subsequent steps...
Checking we have the same jars on all servers, e.g. ZarrReader...
Deletion has been running for an hour and a half now: