logix-project / logix

AI Logging for Interpretability and Explainability🔬
Apache License 2.0

Metadata Diet #87

eatpk closed 7 months ago

eatpk commented 7 months ago

An example metadata element from an actual run looks like this:

{
    'data_id': 'ff0ffa2564b9bad940d3ce79e8c483d91e8f071f57a0908388ab72b3a0e1c329',
    'path': [['1', 'grad'], ['3', 'grad'], ['5', 'grad']],
    'offset': 10886993920,
    'shape': [[512, 784], [256, 512], [10, 256]],
    'block_size': 535040,
    'dtype': 'float32'
}
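
As a sanity check on the example above, `block_size` matches the total number of elements across the three listed shapes (512*784 + 256*512 + 10*256 = 535040). The sketch below verifies that and derives per-path element ranges inside one block. It assumes the tensors are stored back to back in `path` order, which this thread does not confirm; `element_counts` and `split_block` are hypothetical helper names, not part of logix.

```python
from math import prod

# Metadata entry copied from the example in this thread.
entry = {
    "data_id": "ff0ffa2564b9bad940d3ce79e8c483d91e8f071f57a0908388ab72b3a0e1c329",
    "path": [["1", "grad"], ["3", "grad"], ["5", "grad"]],
    "offset": 10886993920,
    "shape": [[512, 784], [256, 512], [10, 256]],
    "block_size": 535040,
    "dtype": "float32",
}


def element_counts(shapes):
    """Number of elements in each tensor shape (hypothetical helper)."""
    return [prod(shape) for shape in shapes]


def split_block(entry):
    """Return (path, start, stop) element ranges within one block.

    Assumption: tensors are laid out contiguously in `path` order.
    """
    ranges = []
    start = 0
    for path, shape in zip(entry["path"], entry["shape"]):
        stop = start + prod(shape)
        ranges.append((tuple(path), start, stop))
        start = stop
    return ranges


counts = element_counts(entry["shape"])
ranges = split_block(entry)

# The per-shape element counts add up to the recorded block size.
assert sum(counts) == entry["block_size"]
```

Under that layout assumption, `ranges` gives the slice of a flat block that belongs to each module path, so the paths and shapes can be stored once per entry as parallel lists.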

sangkeun00 commented 7 months ago

Overall, LGTM. A few small questions:

Have you measured the change in metadata size before and after this PR? Since it's called "diet", I assume it reduces the metadata size. Also, if changing the metadata structure affects data loading speed at all (even by a modest amount, like 5-10%), can you report that too?

eatpk commented 7 months ago

Oh yes, for speed there was no meaningful difference. I ran it 10 times; the averages were:

Before diet: logging time 29.70412812 sec, computation time 15.85226355 sec
After diet: logging time 28.76126018 sec, computation time 15.23824615 sec

For size, the metadata was 4.3MB before and is now 2.8MB for 6000 MNIST samples with 3 module paths, about a 35% reduction.
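
For reference, the relative changes implied by the numbers above can be computed as follows. This is only arithmetic on the reported averages; `relative_reduction` is a hypothetical helper name, and run-to-run variance is not reported in this thread, so the timing differences should not be read as statistically significant.

```python
def relative_reduction(before, after):
    """Fractional reduction going from `before` to `after`."""
    return (before - after) / before


# Averages over 10 runs, as reported in this thread.
logging_change = relative_reduction(29.70412812, 28.76126018)
computation_change = relative_reduction(15.85226355, 15.23824615)

# Metadata size in MB for 6000 MNIST samples with 3 module paths.
size_change = relative_reduction(4.3, 2.8)

print(f"logging time reduction:     {logging_change:.1%}")
print(f"computation time reduction: {computation_change:.1%}")
print(f"metadata size reduction:    {size_change:.1%}")
```

The timing changes come out to a few percent, while the size reduction is roughly 35%, consistent with the figure quoted above.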