fmi-basel / gliberal-scMultipleX

Feature extraction and linking of multiplexing data.
BSD 3-Clause "New" or "Revised" License

Measurement bug in 3D Fractal measurements #79

Closed jluethi closed 1 year ago

jluethi commented 1 year ago

In the latest update, something in how we build the dataframes in the Fractal task broke, which leads to this error when we try to run a 3D measurement:

Traceback (most recent call last):
  File "/tungstenfs/landing/fractal/1016-main-cluster-postgres/FRACTAL_TASKS_DIR/.fractal/scmultiplex0.4.dev44+g36bca79/venv/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3802, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 165, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5745, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5753, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'label'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/tungstenfs/landing/fractal/1016-main-cluster-postgres/FRACTAL_TASKS_DIR/.fractal/scmultiplex0.4.dev44+g36bca79/venv/lib/python3.9/site-packages/scmultiplex/fractal/scmultiplex_feature_measurements.py", line 371, in <module>
    run_fractal_task(
  File "/tungstenfs/landing/fractal/1016-main-cluster-postgres/FRACTAL_TASKS_DIR/.fractal/scmultiplex0.4.dev44+g36bca79/venv/lib/python3.9/site-packages/fractal_tasks_core/_utils.py", line 91, in run_fractal_task
    metadata_update = task_function(**task_args.dict(exclude_unset=True))
  File "/tungstenfs/landing/fractal/1016-main-cluster-postgres/FRACTAL_TASKS_DIR/.fractal/scmultiplex0.4.dev44+g36bca79/venv/lib/python3.9/site-packages/scmultiplex/fractal/scmultiplex_feature_measurements.py", line 318, in scmultiplex_measurements
    if not (df_well["label"] == df_info_well["label"]).all():
  File "/tungstenfs/landing/fractal/1016-main-cluster-postgres/FRACTAL_TASKS_DIR/.fractal/scmultiplex0.4.dev44+g36bca79/venv/lib/python3.9/site-packages/pandas/core/frame.py", line 3807, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/tungstenfs/landing/fractal/1016-main-cluster-postgres/FRACTAL_TASKS_DIR/.fractal/scmultiplex0.4.dev44+g36bca79/venv/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3804, in get_loc
    raise KeyError(key) from err
KeyError: 'label'

My suspicion is that we're merging dataframes that each have a label column, so that the two columns then got renamed to label_1 & label_2 or something like that, leaving no column called label.
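To illustrate the suspected failure mode, here is a minimal sketch with made-up toy dataframes (the `object_id`, `volume` and `well` columns are hypothetical and not taken from the scMultipleX code). When two frames share a non-key column, `pandas.merge` keeps both copies and, by default, appends the suffixes `_x` and `_y`, so a later lookup of the plain `label` column raises the same `KeyError`:

```python
import pandas as pd

# Hypothetical toy frames; only the "label" column name comes from the traceback.
df_features = pd.DataFrame(
    {"object_id": [1, 2], "label": [1, 2], "volume": [10.0, 12.5]}
)
df_info = pd.DataFrame(
    {"object_id": [1, 2], "label": [1, 2], "well": ["A01", "A01"]}
)

# "label" exists in both frames but is not a merge key, so pandas keeps both
# copies and renames them with its default suffixes.
merged = pd.merge(df_features, df_info, on="object_id")
merged_columns = list(merged.columns)

# Looking up the unsuffixed name now fails with KeyError: 'label'.
try:
    merged["label"]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

print(merged_columns)
print(label_lookup_failed)
```

If this were the cause, merging on `label` as well (or dropping the duplicate column before the merge) would keep a single `label` column.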

I'll dig into it and try to add some automated tests for the Fractal tasks, so that we spot such issues earlier.

jluethi commented 1 year ago

Hmm, funny. I now built 2D & 3D tests on a small OME-Zarr file (see https://github.com/fmi-basel/gliberal-scMultipleX/tree/fractal-testing). But all the tests pass fine, so I can't reproduce the bug from Clara above. I need to investigate a bit further.

Next steps:

  1. Get the example data from Clara locally and test there
  2. Verify that the Fractal task on the FMI Fractal server is actually up-to-date

jluethi commented 1 year ago

After all the digging and adding of tests: the issue was that the Fractal task didn't handle an empty image correctly. That is now also being added with #81. I'll run the tests on the cluster to check that the workflow really works.
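For readers hitting the same error, a minimal sketch of why an empty image can produce `KeyError: 'label'`, and one possible guard. This is an assumption about the mechanism, not the actual change in #81: `measure_objects` and `labels_match` are hypothetical helpers, and only the names `df_well`, `df_info_well` and `label` come from the traceback. A dataframe built from zero measured objects has no columns at all, so indexing `"label"` fails before any comparison happens:

```python
import pandas as pd


def measure_objects(object_records):
    """Hypothetical stand-in for a per-image measurement step.

    An image without any labeled objects yields an empty list of records,
    and therefore a DataFrame with no rows and no columns.
    """
    return pd.DataFrame(object_records)


def labels_match(df_well, df_info_well):
    """Compare the label columns of two frames, tolerating empty input."""
    # Two empty frames trivially agree; there is nothing to compare.
    if df_well.empty and df_info_well.empty:
        return True
    # If only one side lacks the column, the frames cannot match.
    if "label" not in df_well.columns or "label" not in df_info_well.columns:
        return False
    if len(df_well) != len(df_info_well):
        return False
    # Compare by position so differing indexes do not raise an error.
    left = df_well["label"].to_numpy()
    right = df_info_well["label"].to_numpy()
    return bool((left == right).all())


empty = measure_objects([])
full = measure_objects([{"label": 1, "area": 5.0}, {"label": 2, "area": 7.0}])

print("label" in empty.columns)
print(labels_match(empty, empty), labels_match(full, full), labels_match(full, empty))
```

A small test image with no objects in it would exercise this path, which the earlier 2D and 3D tests apparently did not.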