SCIInstitute / ShapeWorks

ShapeWorks
http://sciinstitute.github.io/ShapeWorks/

Update PCA_Embedder Saving and Loading #2204

Open acreegan opened 4 months ago

acreegan commented 4 months ago

The main goal of this update is to save a PCA model to disk and load it again later without re-running the PCA analysis on the original data. I wanted to do this from a Python program that imports ShapeWorks as a library. Of the two sets of PCA functionality in the ShapeWorks repository, the PCA_Embedder class from the pure-Python DataAugmentationUtils module was the closest to having these features and the easiest to extend in Python, so this update extends that class.
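
To illustrate the save/load round trip described above, here is a minimal sketch using only NumPy. It is not the PCA_Embedder API: the function names (`fit_pca`, `save_pca`, `load_pca`, `project`), the `.npz` file layout, and the dictionary keys are all assumptions made for illustration. The idea is that storing the mean, components, eigenvalues, and retained dimension count is enough to project new data later without the original training set.

```python
import os
import tempfile

import numpy as np


def fit_pca(data, percent_variability=0.95):
    """Fit PCA on data of shape (n_samples, n_features). Hypothetical helper."""
    mean = data.mean(axis=0)
    centered = data - mean
    # SVD of the centered data gives the principal directions as rows of vt.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigenvalues = s ** 2 / (data.shape[0] - 1)
    # Keep the smallest number of modes that reaches the requested variability.
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    num_dim = int(np.searchsorted(cumulative, percent_variability) + 1)
    num_dim = min(num_dim, len(eigenvalues))
    return {"mean": mean, "components": vt,
            "eigenvalues": eigenvalues, "num_dim": num_dim}


def save_pca(model, path):
    """Write the model arrays to a single .npz file (assumed layout)."""
    np.savez(path, **model)


def load_pca(path):
    """Read the model back; no access to the original data is needed."""
    with np.load(path) as f:
        return {"mean": f["mean"], "components": f["components"],
                "eigenvalues": f["eigenvalues"], "num_dim": int(f["num_dim"])}


def project(model, x):
    """Project samples onto the retained principal components."""
    basis = model["components"][:model["num_dim"]]
    return (x - model["mean"]) @ basis.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shapes = rng.normal(size=(10, 30))  # stand-in for flattened particle sets
    model = fit_pca(shapes)
    with tempfile.TemporaryDirectory() as tmp:
        model_path = os.path.join(tmp, "pca_model.npz")
        save_pca(model, model_path)
        restored = load_pca(model_path)
    print(np.allclose(project(model, shapes), project(restored, shapes)))
```

Keeping `num_dim` in the saved file means a loaded model exposes the same fields as a freshly fitted one, so downstream code does not need to distinguish between the two cases.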

Changes made:

akenmorris commented 3 months ago

I'm getting this error with the deep_ssm use case (automated test):

2024-03-29T00:18:36.3395229Z 8:   File "/__w/ShapeWorks/ShapeWorks/Examples/Python/RunUseCase.py", line 97, in <module>
2024-03-29T00:18:36.3396061Z 8:     module.Run_Pipeline(args)
2024-03-29T00:18:36.3397041Z 8:   File "/__w/ShapeWorks/ShapeWorks/Examples/Python/deep_ssm.py", line 257, in Run_Pipeline
2024-03-29T00:18:36.3398394Z 8:     embedded_dim = DeepSSMUtils.run_data_augmentation(project, num_samples, num_dim, percent_variability, sampler,
2024-03-29T00:18:36.3399786Z 8:   File "/__w/ShapeWorks/ShapeWorks/Python/DeepSSMUtilsPackage/DeepSSMUtils/run_utils.py", line 289, in run_data_augmentation
2024-03-29T00:18:36.3401098Z 8:     embedded_dim = DataAugmentationUtils.runDataAugmentation(aug_dir, train_image_filenames,
2024-03-29T00:18:36.3402499Z 8:   File "/__w/ShapeWorks/ShapeWorks/Python/DataAugmentationUtilsPackage/DataAugmentationUtils/__init__.py", line 22, in runDataAugmentation
2024-03-29T00:18:36.3404177Z 8:     num_dim = DataAugmentation.point_based_aug(out_dir, img_list, world_point_list, num_samples, num_dim, percent_variability, sampler_type, mixture_num, processes)
2024-03-29T00:18:36.3405906Z 8:   File "/__w/ShapeWorks/ShapeWorks/Python/DataAugmentationUtilsPackage/DataAugmentationUtils/DataAugmentation.py", line 37, in point_based_aug
2024-03-29T00:18:36.3407071Z 8:     num_dim = PointEmbedder.num_dim
2024-03-29T00:18:36.3407902Z 8: AttributeError: 'PCA_Embbeder' object has no attribute 'num_dim'

akenmorris commented 3 months ago

@acreegan, I've fixed those errors, but the new PCA embedder test fails on Mac and Windows, I assume due to a precision/rounding difference. I'll take a look at it again when I have a chance.
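
If the failure does come from floating-point differences between platforms (the thread does not confirm this), one common way to make such a test more robust is to compare with a tolerance and to allow for the sign ambiguity of principal components, since different linear algebra backends may return a component or its negation. The sketch below is a generic NumPy illustration, not the actual ShapeWorks test; `align_signs` and `components_close` are hypothetical helper names.

```python
import numpy as np


def align_signs(components, reference):
    """Flip each row of components so it points the same way as the reference row."""
    signs = np.sign(np.sum(components * reference, axis=1))
    signs[signs == 0] = 1.0  # leave orthogonal rows untouched
    return components * signs[:, None]


def components_close(components, reference, rtol=1e-5, atol=1e-8):
    """Tolerance-based comparison that ignores per-component sign flips."""
    aligned = align_signs(components, reference)
    return np.allclose(aligned, reference, rtol=rtol, atol=atol)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Orthonormal rows standing in for PCA components saved on one platform.
    q, _ = np.linalg.qr(rng.normal(size=(6, 6)))
    reference = q.T
    # Same components as another platform might report them:
    # some signs flipped, plus a tiny rounding-level perturbation.
    flips = np.array([-1.0, 1.0, -1.0, 1.0, 1.0, -1.0])
    other = reference * flips[:, None] + 1e-10
    print(np.allclose(other, reference))       # strict comparison
    print(components_close(other, reference))  # sign- and tolerance-aware
```

The tolerances here are NumPy's defaults for `np.allclose`; suitable values for a real test depend on the data scale and would need to be chosen against the actual outputs.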