Princeton-LSI-ResearchComputing / tracebase

Mouse Metabolite Tracing Data Repository for the Rabinowitz Lab
MIT License

Gracefully handle stale `Animal.last_serum_sample` maintained field #986

Closed: hepcat72 closed 3 months ago

hepcat72 commented 4 months ago

Summary Change Description

Caught a situation where the Animal.last_serum_sample maintained field value can be invalid due to manual data manipulation and added an error status with a message suggesting that maintained fields be rebuilt.

While I was at it, I addressed a TODO item in the same serum_validity code:

TODO: MSRunSequence.date can no longer be null, so these warnings can probably be removed

An example of the exception previously encountered is in the issue's current behavior section. This is now what the user gets from this PR when the maintained field value is incorrect:

[Screenshot: stalemtf]

This is what caused the exception. The last_serum_sample value on dev should be the following, which is still what it is in my sandbox:

In [1]: from DataRepo.models import Animal

In [2]: anml = Animal.objects.get(name="20220818M1")

In [3]: anml._last_serum_sample()
Out[3]: <Sample: 20220818_M1_mix1_T150>

In [4]: anml.last_serum_sample
Out[4]: <Sample: 20220818_M1_mix1_T150>

This is what it became on dev:

In [1]: from DataRepo.models import Animal

In [2]: anml = Animal.objects.get(name="20220818M1")
Adding propagation handler to Animal.studies.through

In [3]: anml._last_serum_sample()
Out[3]: <Sample: 20220818_M1_mix1_T150>

In [4]: anml.last_serum_sample

I believe that manual database manipulations caused the stored value to become None. As you can see, the method that generates the value still returns the correct result; only the saved value got wiped out.

Just calling:

In [4]: anml.save()

automatically fixes the maintained fields:

[Screenshot: mntfldfixd]

This PR doesn't fix the values; it just makes the failure graceful and tells us (in the tooltip) what needs to be done to fix the problem.
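To illustrate the idea, here is a minimal, framework-free sketch of the check: compare the stored maintained value with a freshly computed one, and return an error status with a hint instead of raising. The `Animal` dataclass, the `samples` list, and the `serum_status` helper are hypothetical stand-ins for illustration, not the actual TraceBase code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Animal:
    """Hypothetical stand-in for the Django model, not the real class."""

    name: str
    samples: list  # sample names, assumed ordered oldest to newest
    last_serum_sample: Optional[str] = None  # stored (maintained) value

    def _last_serum_sample(self) -> Optional[str]:
        # Recompute the value from source data instead of trusting the stored copy.
        return self.samples[-1] if self.samples else None

    def save(self) -> None:
        # In this sketch, saving refreshes the maintained field.
        self.last_serum_sample = self._last_serum_sample()


def serum_status(animal: Animal) -> tuple:
    """Return (status, message) rather than raising on a stale stored value."""
    if animal.last_serum_sample != animal._last_serum_sample():
        return (
            "error",
            f"Animal.last_serum_sample is stale for {animal.name}; "
            "run rebuild_maintained_fields.",
        )
    return ("ok", "")


# Stored value starts out as None, mimicking the wiped-out field.
anml = Animal("20220818M1", ["20220818_M1_mix1_T150"])
print(serum_status(anml)[0])  # error
anml.save()
print(serum_status(anml)[0])  # ok
```

Returning a status tuple keeps the caller in control of presentation, so the message can be shown wherever is convenient (for example, in a tooltip).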

Note, I should have started a screen session yesterday when I ran rebuild_maintained_fields, because my session went stale and the rebuild was rolled back. I'm going to take a quick look at the script. I think it has a surgical option to make it only apply to certain labeled autoupdate routes. I may also reintroduce the ability to suspend cache updates. I'd previously eliminated that to make it better encapsulated, but I can add it to the management command...

Affected Issues/Pull Requests

Review Notes

See comments in-line.

Checklist

This pull request will be merged once the following requirements are met. The author and/or reviewers should uncheck any unmet requirements:

hepcat72 commented 4 months ago

Just as a side-note, I created a separate branch to run the rebuild without caching retrievals or updates and I ran it again, this time using the --labels option to only perform auto-updates of maintained fields with the fcirc_calcs label:

python manage.py rebuild_maintained_fields --labels fcirc_calcs

...and it took only 4 minutes. It updated all the other animals' values as well as fields in other models with the same label, totaling 6,137 field value updates. Judging by the console output, all of them appear to have been stale.

I then updated all fields:

python manage.py rebuild_maintained_fields

and it updated 6,310 fields, still in about 4 minutes. I hadn't realized that we only use 1 other label, "name", and only for the infusate updates, which explains why so few additional fields were updated.
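For readers unfamiliar with label-restricted rebuilds, the following is a rough sketch of the concept only. The registry, the `maintained_field` decorator, the dict-based records, and the `rebuild` function are all hypothetical; the real command's internals may differ. In particular, this sketch counts only values that actually changed, which is an assumption about how updates are tallied.

```python
REGISTRY = []  # entries: (label, model name, field name, updater function)


def maintained_field(model, field, label):
    """Register a function that recomputes one maintained field under a label."""

    def decorator(func):
        REGISTRY.append((label, model, field, func))
        return func

    return decorator


@maintained_field("Animal", "last_serum_sample", label="fcirc_calcs")
def compute_last_serum_sample(rec):
    return rec["samples"][-1] if rec["samples"] else None


@maintained_field("Infusate", "name", label="name")
def compute_infusate_name(rec):
    return "+".join(rec["tracers"])


def rebuild(db, labels=None):
    """Recompute registered fields, optionally restricted to the given labels.

    Returns the number of stored values that differed and were rewritten.
    """
    updated = 0
    for label, model, field, updater in REGISTRY:
        if labels is not None and label not in labels:
            continue  # skip updaters outside the requested labels
        for rec in db.get(model, []):
            fresh = updater(rec)
            if rec.get(field) != fresh:
                rec[field] = fresh
                updated += 1
    return updated


db = {
    "Animal": [{"samples": ["s1", "s2"], "last_serum_sample": None}],
    "Infusate": [{"tracers": ["glucose", "lactate"], "name": None}],
}
print(rebuild(db, labels=["fcirc_calcs"]))  # 1
print(rebuild(db))  # 1
```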

So it seems that the cache suspension REALLY speeds up the rebuild_maintained_fields script. Plus, you don't get deluged with cache messages (aside from the ones for class registration). I'll be submitting a PR shortly.