Open chris-allan opened 2 weeks ago
Example when used on a small table:
In [1]: sr = client.getSession().sharedResources()
   ...: a = sr.openTable(omero.model.OriginalFileI(24965, False))
In [2]: a.slice([0], [3, 0], {"omero.tables.include_row_numbers": "true"})
Out[2]:
object #0 (::omero::grid::Data)
{
    lastModification = 1718797774346
    rowNumbers =
    {
        [0] = 3
        [1] = 0
    }
    columns =
    {
        [0] = object #1 (::omero::grid::StringColumn)
        {
            name = ImageName
            description =
            size = 53
            values =
            {
                [0] = siControl_N20_Cep215_I_20110411_Mon-1509_0_SIR_PRJ.dv
                [1] = Centrin_PCNT_Cep215_20110506_Fri-1608_0_SIR_PRJ.dv
            }
        }
    }
}
In [3]: a.slice([0], [3, 0], {"omero.tables.include_row_numbers": "false"})
Out[3]:
object #0 (::omero::grid::Data)
{
    lastModification = 1718797774346
    rowNumbers =
    {
    }
    columns =
    {
        [0] = object #1 (::omero::grid::StringColumn)
        {
            name = ImageName
            description =
            size = 53
            values =
            {
                [0] = siControl_N20_Cep215_I_20110411_Mon-1509_0_SIR_PRJ.dv
                [1] = Centrin_PCNT_Cep215_20110506_Fri-1608_0_SIR_PRJ.dv
            }
        }
    }
}
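When row numbers are excluded, a caller can still pair values with rows, because the response preserves the requested order. The following is a minimal sketch of that pairing, not part of this PR: `values_by_row` is a hypothetical helper, and the `SimpleNamespace` objects only stand in for an `omero.grid.Data` response so the snippet runs without a server.

```python
from types import SimpleNamespace


def values_by_row(requested_rows, data, column_index=0):
    """Map each row number to its value in one column of a Data-like response.

    Hypothetical helper: relies on the response order matching the request.
    """
    values = list(data.columns[column_index].values)
    # Use rowNumbers when the server sent them; otherwise fall back to the
    # row numbers we asked for, which are in the same order as the values.
    rows = list(data.rowNumbers) or list(requested_rows)
    if len(rows) != len(values):
        raise ValueError("row count does not match value count")
    return dict(zip(rows, values))


# Stand-in for the response to a.slice([0], [3, 0], ...) with row numbers excluded
data = SimpleNamespace(
    rowNumbers=[],
    columns=[
        SimpleNamespace(
            values=[
                "siControl_N20_Cep215_I_20110411_Mon-1509_0_SIR_PRJ.dv",
                "Centrin_PCNT_Cep215_20110506_Fri-1608_0_SIR_PRJ.dv",
            ]
        )
    ],
)

by_row = values_by_row([3, 0], data)
```

Here `by_row[3]` is the first returned value and `by_row[0]` the second, matching the order of the request.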
If we're happy with the implementation I'll make separate PRs to add integration tests like we have for the bitmask query and update the main OMERO.tables documentation detailing the feature.
👍
I'm surprised that the row numbers are longer than other columns except perhaps bools 😏 but I can definitely see how they would effectively double the overhead.
Also true for short string columns. When it comes to memory usage in Python in particular, also true where the same numbers or strings repeat. These are both common in a lot of the data analysis outputs we're exposed to.
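A rough back-of-the-envelope sketch of the overhead being discussed. It assumes each row number travels as an 8-byte Ice long and ignores sequence size prefixes and object framing, so the figures are illustrative rather than measured. `row_number_overhead` is a hypothetical helper.

```python
ROW_NUMBER_BYTES = 8  # assumption: each row number is encoded as an 8-byte long


def row_number_overhead(n_rows, bytes_per_value, n_columns=1):
    """Return (data_bytes, row_number_bytes, ratio) for fixed-width columns."""
    data_bytes = n_rows * n_columns * bytes_per_value
    row_bytes = n_rows * ROW_NUMBER_BYTES
    # Guard against an empty payload so the ratio is always defined
    ratio = row_bytes / data_bytes if data_bytes else 0.0
    return data_bytes, row_bytes, ratio


# Compare a 1-byte-per-value column with an 8-byte-per-value column
for label, width in [("bool", 1), ("double", 8)]:
    data_bytes, row_bytes, ratio = row_number_overhead(1_000_000, width)
    print(f"{label}: data={data_bytes} row_numbers={row_bytes} ratio={ratio:.1f}")
```

Under these assumptions, row numbers match the size of a single 8-byte column (doubling the payload) and are several times the size of a single 1-byte column.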
Integration test added in ome/openmicroscopy#6396.
Documentation added in ome/omero-documentation#2441.
For calls to readCoordinates, read, and slice, the returned value order in the Data response is the same as requested. While having the row numbers included in the response is convenient, when the number of cells being returned is high this incurs memory and serialization overhead. This is especially true when retrieving a small number of columns for a large number of rows; in this case rowNumbers can actually be more expensive to include than the data itself.
Here we use the Ice context and "omero.tables.include_row_numbers" to additively affect the client API without changing any of the Ice method prototypes.
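As a sketch of what this looks like from client code, the wrapper below builds the context and passes it as the trailing argument, in the same way as the `slice` calls in the example. `row_number_context` and `slice_values` are hypothetical helper names, not part of the OMERO API, and `table` is assumed to be a table proxy such as the one returned by `openTable`.

```python
INCLUDE_ROW_NUMBERS = "omero.tables.include_row_numbers"


def row_number_context(include):
    """Build the Ice context entry; Ice context values must be strings."""
    return {INCLUDE_ROW_NUMBERS: "true" if include else "false"}


def slice_values(table, col_numbers, row_numbers, include_row_numbers=False):
    """Call table.slice() with the row-number flag set in the Ice context.

    Hypothetical wrapper: the method signature itself is unchanged, only the
    optional trailing context differs between calls.
    """
    ctx = row_number_context(include_row_numbers)
    return table.slice(col_numbers, row_numbers, ctx)
```

With the table `a` from the example, `slice_values(a, [0], [3, 0])` would issue the same request as `In [3]`.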
/cc @erindiel, @kkoz, @DavidStirling, @emilroz