labscript-suite-temp-2 / lyse

lyse is an analysis framework. It coordinates the running of Python analysis scripts on experiment data as it becomes available, updating plots in real time.
BSD 2-Clause "Simplified" License

Dataframe stopped updating after multishot routine error #45

Closed philipstarkey closed 4 years ago

philipstarkey commented 5 years ago

Original report (archived issue) by Shaun Johnstone (Bitbucket: shjohnst, GitHub: shjohnst).


I'm not sure how to reproduce this, but I just encountered a situation where single-shot routines were saving new results to the HDF5 files while the dataframe in lyse was not updating.

After a multishot routine had encountered an error, the analysis queue was paused, and the remaining experiments in my sequence came through and were waiting to be analysed. When I unpaused the queue, they were analysed; however, the values in the main lyse table did not update, and multishot routines could not access them.

To resolve the issue, I deleted one of the files, and re-added it to lyse (at which point it had the values displayed correctly without having to run the analysis again). After this, running the analysis routines on the remaining shots that had not updated seemed to work.

philipstarkey commented 5 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


Seems pretty odd. Looking at the code that communicates the updated data back to the dataframe, and the code that updates the dataframe, nothing jumps out at me that would explain the lack of updating, given that the surrounding code seems to be running and not crashing. I'll keep an eye out for it though. If anyone figures out how to reproduce it, we can debug it properly.

philipstarkey commented 5 years ago

Original comment by Philip Starkey (Bitbucket: pstarkey, GitHub: philipstarkey).


I don't suppose you have a log file covering the time of the error?

philipstarkey commented 5 years ago

Original comment by Shaun Johnstone (Bitbucket: shjohnst, GitHub: shjohnst).


I’ve just encountered something like this again… not sure if there was any crashed routine to trigger it this time, but I just noticed that re-running single-shot scripts with changed code/fit parameters would save new results to the HDF files, but not update the lyse dataframe.

As far as I know I’m running the latest version of everything, on Python 3 (the original issue report was on the old apparatus on 2.7).

philipstarkey commented 5 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


I've seen this too, and have had another independent report at NIST, so it seems to be pretty common. Perhaps whatever versions of things are required to trigger it are more common now (despite the 2.7 to 3.x change). Anyhow since I'm seeing it myself occasionally I'll try to reproduce it and investigate (once back from holidays, anyway).

philipstarkey commented 5 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


I have not been able to reproduce this, but I have noticed some issues in how lyse updates rows in the Qt model. Firstly, the code that was supposed to be updating rows was not updating them at all: the check for whether a column is in the list of things that need updating did not take into account that the column names are tuples of arbitrary length, padded with empty strings. So the following (line ~1447 in update_row()):

        # Update the data in the Qt model:
        dataframe_row = self.dataframe.iloc[row_number].to_dict()
        for column_number, column_name in self.column_names.items():
            if not isinstance(column_name, tuple):
                # One of our special columns, does not correspond to a column in the dataframe:
                continue
            if updated_row_data is not None and column_name not in updated_row_data:
                continue

was always hitting the continue and never updating any Qt items. So it needed to be changed to:

-            if updated_row_data is not None and column_name not in updated_row_data:
-                continue
+            if updated_row_data is not None:
+                if column_name[:column_name.index('')] not in updated_row_data:
+                    continue

since the keys of updated_row_data do not have padded empty strings.
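
To make the mismatch concrete, here is a minimal standalone sketch of the membership check. The column names and values are made up for illustration, and strip_padding is a hypothetical helper, not a function in lyse. The guard for a tuple with no empty string is my addition: the diff above calls index('') directly, which raises ValueError if a tuple has no padding, and this thread does not say whether such columns occur.

```python
def strip_padding(column_name):
    """Return the column tuple without its trailing empty-string padding."""
    if '' in column_name:
        return column_name[:column_name.index('')]
    # No padding present: return the tuple unchanged (assumed edge case).
    return column_name


# Hypothetical data: the model's column name is padded, the dict key is not.
padded_column = ('my_routine', 'fit_width', '')
updated_row_data = {('my_routine', 'fit_width'): 1.23}

# The old check compared the padded tuple directly, so it never matched.
print(padded_column in updated_row_data)                 # False

# Stripping the padding first makes the lookup succeed.
print(strip_padding(padded_column) in updated_row_data)  # True
```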

So the Qt model was never being updated when we thought it was! Instead, it was being updated in a later call that was intended to only update the percentage status of the row, but actually updated every Qt model item, defeating the purpose of intentionally trying to only update the ones that changed.

So I've now split off the updating of the progress percentage to a different function (such that that call does not touch any parts of the Qt model other than the progress percentage), and fixed the check for whether a model item should be updated such that the other calls actually do update it.
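
As a rough illustration of that split, the toy class below keeps the progress update separate from the data update and counts how many data items get rewritten. It is a plain-Python stand-in under my own naming (ToyRowModel, update_data, update_progress are hypothetical), not lyse's Qt model code.

```python
class ToyRowModel:
    """Toy stand-in for one row of the Qt model; not lyse's actual classes."""

    def __init__(self, columns):
        self.items = {name: None for name in columns}
        self.progress = 0
        self.item_writes = 0  # how many data items have been rewritten

    def update_data(self, updated_row_data):
        # Rewrite only the items whose columns actually changed.
        for name, value in updated_row_data.items():
            if name in self.items:
                self.items[name] = value
                self.item_writes += 1

    def update_progress(self, percent):
        # Touches the progress indicator only, never the data items.
        self.progress = percent


row = ToyRowModel([('routine', 'width'), ('routine', 'height')])
row.update_progress(50)
row.update_data({('routine', 'width'): 1.5})
row.update_progress(100)
print(row.item_writes)  # 1
```

With the two paths separated, a progress update costs nothing per data column, and a data update rewrites only what the analysis routine reported as changed.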

I still do not see what caused the original issue, so I do not have a great reason to think this will fix it. But, since the code was not doing the intended thing before (even though the result should still have worked) and it is doing the intended thing now, there is some chance maybe it will fix the issue, whatever it was.

This attempted fix is in PR #68.

If the problem occurs again, could you check whether the value in the lyse GUI agrees with the value in the dataframe? You can grab the lyse dataframe from a separate Python interpreter with import lyse; df = lyse.data() to inspect it without disturbing lyse.

This could help narrow it down to whether it is the GUI part that has a problem, or something before then.

philipstarkey commented 4 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


An attempt to fix issue #45.

Although I don't see why the Qt model or dataframe might sometimes not update, I noticed that the calls to update_row with updated_row_data provided were not actually updating the Qt model, because the check for whether a column was in the updated_row_data dictionary was comparing a column tuple padded with empty strings to one not padded with empty strings. Further, the call to update the progress percentage was updating all Qt model items for the row.

Now, the two things are fixed - the updated_row_data does actually update the Qt model items, and the update of the status percentage does not.

There is no reason to believe this will fix issue #45 other than it touching code that could be relevant and was not working as intended. I still don't see specifically what could cause issue #45.

Also, the updated data dict sent back to lyse from analysis routines should only include items under the 'results/' group in the HDF5 file. So this fixes that too.

→ <<cset 243198937969ba47bf4f55adacc1be66a3e307fa>>

philipstarkey commented 4 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


Merged in attempted-bugfix (pull request #68)

An attempt to fix issue #45.

→ <<cset f8478c575cb607f3fe320f1f5866fcf44bcd031d>>