hpcflow / hpcflow-new

Mozilla Public License 2.0

MessagePack does not support arbitrarily large integers #692

Open aplowman opened 3 months ago

When using the Zarr persistent store, we encode the run metadata using MessagePack. One part of this metadata is a "directory snapshot", which is essentially data from os.stat. I have been getting "OverflowError: int too big to convert" when hpcflow tries to save the run metadata on Windows with Python 3.12, with the workflow directory inside a DevDrive. This doesn't happen with Python 3.11 on a DevDrive, or with Python 3.12 outside a DevDrive.

This is because, on Windows with Python 3.12 on a DevDrive, some of the integers in an os.stat_result are, for whatever reason, very large, and MessagePack does not support arbitrarily large integers (its integer types are limited to 64 bits).
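
To make the limit concrete: a MessagePack integer must lie between -2**63 and 2**64 - 1. Below is a minimal sketch for finding which fields of an os.stat_result fall outside that range. The helper name oversized_stat_fields is hypothetical and not part of hpcflow. The Python documentation notes that, from 3.12, st_ino may be up to 128 bits on Windows depending on the file system, so that field is a plausible culprit, but this has not been confirmed here.

```python
import os

# MessagePack integers are limited to 64 bits (signed minimum, unsigned maximum).
MSGPACK_INT_MIN = -(2**63)
MSGPACK_INT_MAX = 2**64 - 1


def oversized_stat_fields(stat_result):
    """Return {field_name: value} for integer st_* fields outside the MessagePack range.

    Hypothetical diagnostic helper; not part of hpcflow.
    """
    oversized = {}
    for name in dir(stat_result):
        if not name.startswith("st_"):
            continue
        value = getattr(stat_result, name)
        if isinstance(value, int) and not (MSGPACK_INT_MIN <= value <= MSGPACK_INT_MAX):
            oversized[name] = value
    return oversized


# An empty dict means every integer field would fit in a MessagePack integer.
print(oversized_stat_fields(os.stat(".")))
```

Running this inside the affected workflow directory should show which fields are too large.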

The solution is to encode the os.stat_result data as a list of strings before passing it to MessagePack. This is presumably less space efficient. However, I am considering removing the "directory snapshot" feature altogether, because for very large workflows this metadata would become disproportionately large, and arguably there is currently no real use case for it.
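
A minimal sketch of the string encoding, assuming the snapshot only needs the ten tuple fields of an os.stat_result (st_mode, st_ino, st_dev, st_nlink, st_uid, st_gid, st_size and the three integer timestamps). The function names encode_stat and decode_stat are hypothetical, and the actual snapshot format in hpcflow may differ.

```python
import os


def encode_stat(stat_result):
    """Encode the tuple fields of an os.stat_result as a list of decimal strings.

    Strings have no size limit in this sense, so arbitrarily large integers
    survive serialisation.
    """
    return [str(value) for value in stat_result]


def decode_stat(encoded):
    """Rebuild an os.stat_result from the list produced by encode_stat."""
    return os.stat_result(tuple(int(value) for value in encoded))


original = os.stat(".")
encoded = encode_stat(original)  # a list of str, safe to pass to MessagePack
restored = decode_stat(encoded)
assert tuple(restored) == tuple(original)
```

Only the integer tuple fields are round-tripped in this sketch. Float and nanosecond timestamps (st_mtime, st_mtime_ns and so on) are not preserved and would need separate handling if the snapshot relies on them.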