microsoftgraph / msgraph-sdk-python

MIT License
356 stars 48 forks source link

Serialization Documentation & Functionality #200

Open danieldjewell opened 1 year ago

danieldjewell commented 1 year ago

Serialization appears to be pretty broken... One of the biggest overarching issues, from my perspective, with the entire project, is support for easy serialization and thus integration with the rest of the Python ecosystem. While I understand the benefits of using Kiota to auto-generate a majority of the code (the Graph API has a truly huge number of endpoints/functions/methods/etc.) ... the lack of "pythonic" support for serialization is a problem.

Serializing a UserCollectionResponse

As a specific bug, consider:

import kiota_serialization_json 

# "client" is already instantiated and is a "msgraph.graph_service_client.GraphServiceClient" object
users = await client.users.get() 
# type(users) == msgraph.generated.models.user_collection_response.UserCollectionResponse
kjsonfactory = kiota_serialization_json.json_serialization_writer_factory.JsonSerializationWriterFactory()
kjsonwriter = kjsonfactory.get_serialization_writer(kjsonfactory.get_valid_content_type())
users.serialize(kjsonwriter)
### Fails with: 
# AttributeError: 'bytes' object has no attribute 'serialize'
## Specifically... 

# File /usr/lib/python3.10/site-packages/msgraph/generated/models/device_key.py:142, in DeviceKey.serialize(self, writer)
#    140     raise Exception("writer cannot be undefined")
#    141 writer.write_uuid_value("deviceId", self.device_id)
#--> 142 writer.write_object_value("keyMaterial", self.key_material)
#    143 writer.write_str_value("keyType", self.key_type)
#    144 writer.write_str_value("@odata.type", self.odata_type)

At a minimum, the above should fail gracefully (either by default, or with a parameter ... errors='ignore' - when unable to convert something, a dummy value should be returned).

Pythonic Design

Python has some rather convenient language patterns - notably: reasonably straightforward and predictable "type conversion" (maybe "deconstruction" or "serialization" are appropriate terms too). An example: If one has a Pandas DataFrame (pandas.DataFrame) called df, one can run dict(df) and/or list(df) in addition to instance methods for conversion (df.to_dict(), df.to_csv(), df.to_parquet(), df.to_json(), etc.)

As a general design observation, the current implementation does not implement critical aspects of the Python Data Model - including the repr method, which is designed to allow objects to return a "viewable" output suitable for use in the Python REPL/IPython/Jupyter...

As a specific example, the UserCollectionResponse object does not support simple "serialization" either by being able to call dict(users) or users.to_dict() / users.to_list()... The apparent requirement to import yet another and different Python module (kiota_serialization_json) just to be able to attempt conversion to JSON is a little convoluted. This should be as simple as calling users.to_json() (with a default, pre-instantiated writer - one that could be overridden if required... i.e. users.to_json(custom_writer). Not to mention the fact that calling serialize() doesn't actually return any data ... one must call the JsonSerializationWriter.get_serialized_content() method to actually get the data......

Interoperability

As it stands now, the lack of "easy" conversion is a major impediment to actually using the SDK. While the theory of the SDK is great, it's frankly much easier to just make a raw HTTP request to https://graph.microsoft.com/beta/users and parse the JSON response myself. I can get the data I need into a Pandas DataFrame with 3 or 4 lines of code.... Given the number of properties of a user, it is NOT an option to access the properties of the user manually as the samples suggest..... user.id, user.user_principal_name etc.

Converting a collection of objects to a Pandas DataFrame should (and could, if implemented properly!) be as easy as calling pd.DataFrame.from_records(users) on the UserCollectionResponse object (or any other collection). Even having a .to_list() method for UserCollectionResponse which returns a "serialized" list (i.e. a list of dicts that only contain basic Python types) would be a vast improvement as it would allow one to call pd.DataFrame.from_records(users.to_list())...

Dependencies & Documentation

I understand the need for dependencies and the strong desire for code re-use. However, the documentation is severely lacking on explaining the reliance/use of Kiota. I couldn't find any documentation or examples indicating how to even properly use the UserCollectionResponse.serialize() method and, since it's not implemented in a Pythonic way, it took a significant amount of time to research how to even use it. (The fact that the repository for the kiota_serialization_json module is in a completely different GitHub organization doesn't help. A simple readme/reference with direct links to the repositories of all dependencies would be of great assistance...)

EDIT: I don't mean to sound overly critical here - I have had and continue to have deep respect for Microsoft and its products. At the same time, I suppose the respect that Microsoft has rightfully earned from me also comes with high expectations - expectations that things will work and work well. In the past, Microsoft and its products have mostly met or exceeded expectations. However, in recent years, that has changed... And I suppose it's a little frustrating and disappointing. It seems to me as if there has been a shift to "quantity" over "quality"......

JohnAGEnoda commented 7 months ago

Agree - many of the benefits of the python SDK slip away if you have to wrestle with Kiota

@danieldjewell How did you get UserCollectionResponse.serialize() to work?

copdips commented 6 months ago

hello, for an interoperability issue, could you please explain how to in a smart way convert to dict by keeping the origial API reposnse camel case, current __dict__ makes the dict keys in snake case.

Ideally, a param during quering to keep the original cases directly.