Ensure `_load_FeatureCollection` correctly assigns `filenames` attribute

cpalfonso commented 4 weeks ago

This PR resolves #185.

The code snippet from that issue:

import pickle
from gplately import PlateReconstruction
from plate_model_manager import PlateModelManager

pmm = PlateModelManager()
model = pmm.get_model("Muller2019")
plate_reconstruction = PlateReconstruction(
    rotation_model=model.get_rotation_model(),
    topology_features=model.get_layer("Topologies"),
)
pickled = pickle.loads(pickle.dumps(plate_reconstruction))
print(f"Original num. features: {len(plate_reconstruction.topology_features)}")
print(f"Pickled num. features: {len(pickled.topology_features)}")

now produces the correct output:

Original num. features: 7941
Pickled num. features: 7941

brmather commented 2 weeks ago

This still does not work for me. For instance,

rotation_model = pygplates.RotationModel("Data/Model/CombinedRotations.rot")
topologies = pygplates.FeatureCollection("Data/Model/Feature_Geometries.gpml")
topologies.add(pygplates.FeatureCollection("Data/Model/Plate_Boundaries.gpml"))

static_polygons = pygplates.FeatureCollection("Data/Model/StaticGeometries/StaticPolygons/Global_EarthByte_GPlates_PresentDay_StaticPlatePolygons.shp")
coastlines = pygplates.FeatureCollection('Data/Model/StaticGeometries/Coastlines/Global_coastlines_low_res.shp')
continents = pygplates.FeatureCollection('Data/Model/StaticGeometries/ContinentalPolygons/Global_EarthByte_GPlates_PresentDay_ContinentsOnly.shp')
COBs = pygplates.FeatureCollection('Data/Model/StaticGeometries/COBLineSegments/Global_EarthByte_GeeK07_COBLineSegments_2019_v1.shp')

model = gplately.PlateReconstruction(rotation_model, topologies, static_polygons)
gplot = gplately.PlotTopologies(model, coastlines, continents, COBs)

still does not return any filenames in FeatureCollection objects (coastlines, continents, COBs, etc.) RotationModel works fine, however.

brmather commented 2 weeks ago

This was working correctly before #173 (ahem @michaelchin !!)

We should implement some tests to make sure self.filenames are not empty when loading from files. Otherwise the whole parallelism falls over.

michaelchin commented 2 weeks ago

This was working correctly before #173 (ahem @michaelchin !!)

We should implement some tests to make sure self.filenames are not empty when loading from files. Otherwise the whole parallelism falls over.

I had created a unittest file before.

https://github.com/GPlates/gplately/blob/master/tests-dir/unittest/test_pygplates_wrapper.py

Maybe we can modify and incorporate this unit test file in the automated Pytest cases.

The reason that the features were not pickled might be because the setstate and getstate functions were not implemented.

I am not entirely sure about the missing filenames. It might due to the missing of keyword filenames when calling the functions.

@cpalfonso do you want to me to merge this PR or you prefer doing it yourself?

michaelchin commented 2 weeks ago

It is this PR a draft or ready to merge?

michaelchin commented 2 weeks ago

See https://github.com/GPlates/gplately/commit/5f3f9aaeb00b1cb0af9b38c8431d72ea4436e33f#diff-6b64fd54618a63afc3873c4dfacd6362a8bb951febf8dc1e25950071eea60255R245-R248

I have added the code to track the file name which is passed as features parameter when creating FeatureCollection object.

cpalfonso commented 2 weeks ago

@michaelchin I can merge this now.

brmather commented 2 weeks ago

Hi @michaelchin,

Maybe we can modify and incorporate this unit test file in the automated Pytest cases.

You should absolutely add these tests to the automated testing. Is there any reason why all these tests are not run automatically using CI?

The reason that the features were not pickled might be because the setstate and getstate functions were not implemented.

I'm just using pygplates.FeatureCollection at this stage. There are still problems:

It looks like filenames has become a global variable, appending multiple filenames from other FeatureCollections. Execute the same cell and the filenames are appended again.
The FeatureCollection is not properly loaded when you specify the filename using the filenames keyword. In my example above, if you set gplot.time=0 and access the gplot.COBs attribute, you get a None.

As far as I can tell this object is completely broken. You should build and example on your machine and test a fix for this thoroughly. I say this because all of the workflows which make use of multiprocessing rely on this working. This is high priority stuff!

michaelchin commented 2 weeks ago

Hi @michaelchin,

Maybe we can modify and incorporate this unit test file in the automated Pytest cases.

You should absolutely add these tests to the automated testing. Is there any reason why all these tests are not run automatically using CI?

The reason that the features were not pickled might be because the setstate and getstate functions were not implemented.

I'm just using pygplates.FeatureCollection at this stage. There are still problems:

It looks like filenames has become a global variable, appending multiple filenames from other FeatureCollections. Execute the same cell and the filenames are appended again.

The FeatureCollection is not properly loaded when you specify the filename using the filenames keyword. In my example above, if you set gplot.time=0 and access the gplot.COBs attribute, you get a None.

As far as I can tell this object is completely broken. You should build and example on your machine and test a fix for this thoroughly. I say this because all of the workflows which make use of multiprocessing rely on this working. This is high priority stuff!

I think we should roll back to 9bb69306ae2c18f7fbbfe163b8537011a32b8871 for this file for now and create a new branch to track the new changes.

michaelchin commented 2 weeks ago

You should absolutely add these tests to the automated testing. Is there any reason why all these tests are not run automatically using CI?

They are meant to be used by me during development. There is no reason to add all of them to the automated testing. Only parts of them should be selected and incorporated to the automated testing.

GPlates / gplately

Ensure `_load_FeatureCollection` correctly assigns `filenames` attribute #186