While looking into #1872, I noticed that landing_zone_move still uses the legacy get_subcoll_obj_paths() helper, originally added (IIRC) in the old sodar_taskflow service.
Iterating through subcollections this way is very inefficient, and it is the reason why we use admin SQL queries instead of walk() in IrodsAPI.
Uses of the helper should be replaced by appropriate calls to IrodsAPI.get_objects(). The helper can then be removed for good.
I'm not sure if this is the root cause of #1872. That seems to be caused by an iRODS timeout, but could that happen if we spent a long time traversing Collection.subcollections? In any case, doing this should at the very least speed up landing zone jobs for zones with a lot of subcollections and let us get rid of an inefficient helper.
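To illustrate why the recursive approach scales badly, here is a minimal toy sketch. It does not use the real IrodsAPI or python-irodsclient: FakeColl, FakeCatalog, list_children(), query_objects() and walk_recursive() are all hypothetical names invented for this example. It only assumes that listing the contents of one collection costs at least one server round trip, while a flat query over the whole subtree costs one.

```python
from dataclasses import dataclass, field


@dataclass
class FakeColl:
    """Hypothetical stand-in for an iRODS collection (not the real API)."""
    path: str
    objects: list = field(default_factory=list)
    subcolls: list = field(default_factory=list)


class FakeCatalog:
    """Toy catalog that counts simulated server round trips."""

    def __init__(self, root):
        self.root = root
        self.round_trips = 0
        self._flat = []  # Flat index of all object paths
        self._build(root)

    def _build(self, coll):
        self._flat += [f'{coll.path}/{n}' for n in coll.objects]
        for sub in coll.subcolls:
            self._build(sub)

    def list_children(self, coll):
        """One round trip per collection, like reading its subcollections."""
        self.round_trips += 1
        return coll.subcolls, [f'{coll.path}/{n}' for n in coll.objects]

    def query_objects(self, prefix):
        """One round trip for the whole subtree, like a single SQL query."""
        self.round_trips += 1
        return sorted(p for p in self._flat if p.startswith(prefix + '/'))


def walk_recursive(catalog, coll):
    """Collect object paths by visiting every subcollection in turn."""
    subcolls, paths = catalog.list_children(coll)
    for sub in subcolls:
        paths += walk_recursive(catalog, sub)
    return paths


def build_tree():
    """Build a small zone: 7 collections holding 10 objects in total."""
    root = FakeColl('/zone/lz', ['a.txt'])
    for i in range(3):
        sub = FakeColl(f'/zone/lz/sub{i}', ['r1.fastq', 'r2.fastq'])
        sub.subcolls.append(FakeColl(f'/zone/lz/sub{i}/deep', ['x.bam']))
        root.subcolls.append(sub)
    return root


if __name__ == '__main__':
    catalog = FakeCatalog(build_tree())
    recursive = sorted(walk_recursive(catalog, catalog.root))
    print('recursive round trips:', catalog.round_trips)
    catalog.round_trips = 0
    flat = catalog.query_objects('/zone/lz')
    print('flat query round trips:', catalog.round_trips)
    print('same result:', recursive == flat)
```

In this toy model the recursive walk needs one round trip per collection, so its cost grows with the number of subcollections in the zone, while the flat query stays at a single round trip for the same result.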