Open andrewbanchich opened 1 year ago
It would be really nice if Smithy codegen could use T: Serialize for input types and T: Deserialize for output types instead of concrete values so that users could choose.
@thomas-k-cameron has done a lot of work to add Serialize/Deserialize support, and there are a number of PRs open to add it to other types in aws-smithy-types (#2645, #2646, #2647).
The same pattern can be followed to also add them to Document. It just needs to be behind #[cfg_attr(aws-sdk-unstable, feature = "serialize")] or #[cfg_attr(aws-sdk-unstable, feature = "deserialize")] conditional compilation flags for now.
We won't get to this any time soon, but you're welcome to contribute and work with @thomas-k-cameron to get this done.
To clarify, this is more about the Document type using a HashMap instead of a serde_json::Map. Even if Document did impl Serialize / Deserialize, many users would still need to convert a Document to a Value and back. This conversion is trivial for the other variants of Document because their inner types line up with those of serde_json::Value, so one can just unwrap them from Document and wrap them in Value. Document::Object wouldn't be able to do that though.
serde_json::Map uses BTreeMap or IndexMap, selected by conditional compilation, and it used to use other implementations (e.g. LinkedHashMap).
Since serde_json makes no guarantee about the underlying map type, I don't know if it would be a good idea.
Once RFC30 is implemented you should be able to do something like this.
let doc = aws_smithy_types::Document::Object(HashMap::new());
let serde_json_value = serde_json::to_value(doc).unwrap();
let target_datatype = serde_json::from_value(serde_json_value).unwrap();
If your concern is for the UX then I think we could add a function like this.
impl Document {
pub fn to_serde_json(&self) -> serde_json::Value {
// convert to Value
}
}
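A rough sketch of what such a conversion might look like. To keep it self-contained, Document and Value below are simplified local stand-ins (not the real aws_smithy_types::Document or serde_json::Value), and to_value is a hypothetical name. It shows the asymmetry discussed above: scalar variants are simply rewrapped, while arrays and objects have to be rebuilt element by element.

```rust
use std::collections::{BTreeMap, HashMap};

/// Simplified stand-in for Smithy's `Document` (objects backed by `HashMap`).
#[derive(Debug, Clone, PartialEq)]
pub enum Document {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Document>),
    Object(HashMap<String, Document>),
}

/// Simplified stand-in for `serde_json::Value` (objects backed by an ordered map).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Value>),
    Object(BTreeMap<String, Value>),
}

impl Document {
    /// Hypothetical conversion, analogous to the proposed `to_serde_json`.
    pub fn to_value(&self) -> Value {
        match self {
            // Scalars: unwrap from one enum, wrap in the other.
            Document::Null => Value::Null,
            Document::Bool(b) => Value::Bool(*b),
            Document::Number(n) => Value::Number(*n),
            Document::String(s) => Value::String(s.clone()),
            // Containers: every element must be visited and converted.
            Document::Array(items) => {
                Value::Array(items.iter().map(Document::to_value).collect())
            }
            Document::Object(map) => Value::Object(
                map.iter()
                    .map(|(k, v)| (k.clone(), v.to_value()))
                    .collect(),
            ),
        }
    }
}
```

The Object arm allocates a whole new map and clones every key, which is the per-request cost the thread is concerned about for large documents.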
My concern isn't about implementing Serialize or Deserialize, and it isn't the API for doing so.
I'm saying that 99.9% of the Rust ecosystem uses serde_json::Value for representing JSON, not Smithy's Document type. For inputs, users will have API handlers which accept a Document and output a Document, but in between the input and output, basically everyone will need to convert this type manually to serde_json::Value.
Converting between map types is not trivial. Users will need to pay a performance cost for arbitrarily large map conversions.
serde_json is flexible enough to allow the user to choose what type they want to deserialize to; if I want a HashMap then I can have it deserialize to a HashMap. If I want a serde_json::Map, that works too.
I'm saying IMO forcing one map type onto all users is not ideal. It would be nice to allow users to specify the type like serde_json does.
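As a small illustration of letting the caller pick the container, here is a sketch that uses std's FromIterator as a stand-in for serde's Deserialize (so it compiles without external crates). The function object_from_pairs and the i64 value type are hypothetical; the point is only that the return type is chosen at the call site.

```rust
use std::collections::{BTreeMap, HashMap};
use std::iter::FromIterator;

/// Hypothetical helper: collects key/value pairs into whatever map type
/// the caller asks for, instead of hard-coding one.
pub fn object_from_pairs<M>(pairs: Vec<(String, i64)>) -> M
where
    M: FromIterator<(String, i64)>,
{
    pairs.into_iter().collect()
}

/// The same input materialized as two different map types.
pub fn example() -> (HashMap<String, i64>, BTreeMap<String, i64>) {
    let pairs = vec![("b".to_string(), 2), ("a".to_string(), 1)];
    let hashed: HashMap<String, i64> = object_from_pairs(pairs.clone());
    let ordered: BTreeMap<String, i64> = object_from_pairs(pairs);
    (hashed, ordered)
}
```

With this shape, no conversion pass is needed afterwards because the data is built directly into the type the caller wants.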
I guess my concern also applies to Document::Array - is Smithy going to make users manually map over arrays and call some custom convert_to_x function?
Other than Array and Object, the other variant types are all standard types, so they can easily be used throughout the ecosystem. However, Vec and HashMap both use generics, which is why I believe they are problematic, and why you should pass control of the concrete type through to the user.
I understand the concern, but we don't want to depend on any serde libraries out of the box. I could be persuaded to represent Document with serde_json::Value behind the scenes, behind a #[cfg_attr(aws-sdk-unstable)] switch, though, and also add to/from Value conversions.
Our concern is with maintainability. The SDK will have to be maintained without breaking changes for many years, so exposing separate libraries in the public API is a risk.
aws-sdk-unstable sounds great to me! Thank you.
Any thoughts on handling this with a generic and allowing the user to specify the concrete type? I'd think that would be the best of both worlds.
Every data type that uses Document would need to accept generic parameters, so I'm pretty sure you would need to make a lot of changes to the code gen logic.
Also, I'm not quite sure how maintainable it is to introduce a new interface that needs to be implemented.
I think it's simpler and more practical to do what serde_json is doing.
"Every data type that uses Document would need to accept generic parameters"
Why? I'm suggesting to do what serde_json does here: https://docs.rs/serde_json/latest/serde_json/fn.from_reader.html
This would give users the choice to use Document or Value or HashMap or anything else.
I thought you meant something like this.
enum Document<T: GenericMap = HashMap<String, Document>>
I don't think I understand you correctly, would you mind giving me an example?
It would be nice to have a default generic parameter to allow users to choose the kind of map, but when I said T: Deserialize I meant that in place of Document altogether.
There are so many parallels to serde_json::Value here, I'm wondering why we couldn't allow users to choose the entire type themselves without a need for a wrapper enum like Document or Value.
If I use a Smithy structure, I get a Smithy type which is a struct. If I use a Smithy enum, I get a Smithy type.
If I choose String or Number or bool, however, I get a standard Rust type.
I'm wondering if a Smithy model Document type could be more of a generic itself than a concrete type, and deserialize to a type I have defined in my own crate rather than one Smithy defined.
This would allow users to choose the specific map implementation used and have the benefit of not being subject to the orphan rule.
E.g. the Smithy model type is generic T (fake keyword I just invented). The codegen for Rust would produce T: Deserialize if used in an input type in the model.
The route handler for the server, for example, would accept a function which accepts an argument of any type which implements Deserialize.
Then, as a user, I can write my handler fn handler(input: Whatever) -> Result<...> and the input will be deserialized to a struct Whatever { .. }.
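A minimal sketch of that idea, under stated assumptions: FromDocument is a hypothetical trait standing in for serde::Deserialize, RawDocument is a simplified stand-in for the wire payload, and route is an invented name rather than anything in smithy-rs. It only shows how a router could accept a handler whose input is a user-defined type.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a document payload (string fields only).
pub type RawDocument = HashMap<String, String>;

/// Hypothetical trait playing the role of `serde::Deserialize` in this sketch.
pub trait FromDocument: Sized {
    fn from_document(doc: &RawDocument) -> Result<Self, String>;
}

/// A user-defined input type, living in the user's own crate.
#[derive(Debug, PartialEq)]
pub struct Whatever {
    pub name: String,
}

impl FromDocument for Whatever {
    fn from_document(doc: &RawDocument) -> Result<Self, String> {
        let name = doc
            .get("name")
            .ok_or_else(|| "missing field `name`".to_string())?;
        Ok(Whatever { name: name.clone() })
    }
}

/// Hypothetical router entry point: accepts any handler whose input
/// type implements `FromDocument`.
pub fn route<I, O, F>(doc: &RawDocument, handler: F) -> Result<O, String>
where
    I: FromDocument,
    F: Fn(I) -> Result<O, String>,
{
    let input = I::from_document(doc)?;
    handler(input)
}

/// The user's handler works with their own type directly.
pub fn handler(input: Whatever) -> Result<String, String> {
    Ok(format!("hello, {}", input.name))
}
```

Because Whatever and its FromDocument impl both live in the user's crate, the orphan rule does not get in the way, and no intermediate Document value is ever exposed to the handler.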
To phrase it slightly differently, the point of serde_json is to provide polymorphism over anything which can be serialized into or deserialized from JSON.
The Value enum is only one form of this polymorphism. Generics are another, more powerful and flexible form.
If Smithy is providing the enum, could it also provide the generic?
This would be fixed with #2858, which is what I was asking for and more 😄
@andrewbanchich Why does #2858 solve this? In #2858, server operation handlers would take in types that implement FromRequest, but FromRequest should only be implemented by the generated SDK on the Rust struct corresponding to the operation input.
I had spoken with @hlbarber about this and it sounded like we'd be able to impl FromRequest for our custom types as well. Maybe I misunderstood.
I don't think we should let users implement FromRequest for their own types. In fact, I think we should doc-hide and seal the trait.
If we allow users to implement FromRequest, and #2858 lands, users could easily violate their Smithy model.
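For readers unfamiliar with the term, "doc-hide and seal" usually refers to the sealed-trait pattern sketched below. This is only an illustration: the FromRequest signature here is a simplified, hypothetical one (the real trait differs), and GetItemInput is an invented stand-in for a generated operation input.

```rust
mod sealed {
    /// Supertrait in a private module: code outside this crate cannot
    /// name it, so it cannot implement `FromRequest` for its own types.
    pub trait Sealed {}
}

/// Stand-in for a generated operation input.
#[derive(Debug, PartialEq)]
pub struct GetItemInput {
    pub key: String,
}

/// Hypothetical, simplified `FromRequest`; hidden from rendered docs
/// and only implementable inside this crate.
#[doc(hidden)]
pub trait FromRequest: sealed::Sealed + Sized {
    fn from_request(body: &str) -> Result<Self, String>;
}

// Only the crate that owns `sealed` can write this impl.
impl sealed::Sealed for GetItemInput {}

impl FromRequest for GetItemInput {
    fn from_request(body: &str) -> Result<Self, String> {
        if body.is_empty() {
            return Err("empty request body".to_string());
        }
        Ok(GetItemInput { key: body.to_string() })
    }
}
```

Downstream crates can still call from_request on generated types, but an attempt to implement FromRequest for a custom type fails to compile because the Sealed supertrait is unreachable.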
I think exposing this trait would be very beneficial.
Smithy cannot serve as a source of truth for all validation rules for inputs, so currently our validation is split between our Smithy-generated types and our custom server code, with the server code performing the validation which Smithy cannot express. Because of this, the code we work with has no single source of truth for validation, which is not ideal. I would argue that the inverse is already true: Smithy allows applications to violate their "true" set of validation rules since some cannot be expressed by Smithy.
In this context, I think it would be better to have a single source of truth defined by application code than to have two sources of truth.
Exposing the trait would also mean we can avoid having a ton of code for converting between types, as well as running into the orphan rule when trying to add abstractions over Smithy types which don't exist.
Because Document uses HashMap for objects instead of serde_json::Map, there needs to be a second pass at converting that HashMap to a serde_json::Map when using libraries which work with serde_json::Values, and then another for turning it back to a HashMap/Document for use in Smithy output types. serde_json::Value is the standard for the Rust ecosystem, so I expect a lot of crates will need to do this.