Why are ToSql and FromSql different traits?

jeff-davis commented 11 years ago

First of all, please tell me if there is a better forum for discussion like this. Also, as a disclaimer, I am very new to rust.

A type is pretty useless unless it has conversions in both directions, right? So why not just make one trait for both directions?

Also, what is the model for binary versus text representations? Going in, it seems like the type can represent itself as either text or binary; but coming back, the type needs to either support both, or at least declare which one it does support.

sfackler commented 11 years ago

This is the right place to discuss it :smiley:

It is absolutely true that if a type can be transformed in one direction, it should probably be able to be transformed in the other. However, the traits end up being separate for technical reasons. Take, str for example. If we're turning a Postgres VARCHAR value into a Rust string, we have to return a ~str, so we need to implement FromSql for ~str. But, we want to be able to turn any Rust string into a VARCHAR value, whether or not it is an owned string (~"foo" is an owned string and "foo" is a static string). The way to handle this is to implement ToSql for borrowed pointers to strings (&'self str). So, we end up with

impl FromSql for ~str { ... }
impl<'self> ToSql for &'self str { ... }

The binary versus textual representation for data comes from the Postgres communication protocol. The basic flow when executing a statement goes like:

Prepare the query string. The driver sends the database the query string and the database tells the driver the formats of the parameters and results.
Bind parameters to the statement. At this point, the driver sends the database all of the parameter values along with their formats as well as the requested return formats for the result values.
Execute. The database sends the resulting rows in the format specified at bind time.

So, ToSql implementations can choose what format they want to encode to, but FromSql implementations can't since the driver doesn't know which implementation is going to be used to decode the result until the results have already been transmitted over.

The way that the formats for result values are specified is currently a bit hacky. Any type that the driver handles natively (INT, BIGINT, VARCHAR, etc) is transferred in binary format. Everything else is transferred in text format. You can call the result_format method on the PostgresType value passed to from_sql to determine what format the data is in.

I'm not totally sure what the right thing to do with the representation of unknown Postgres types. Text seems safest, but it might be worth sending everything over in binary representation.

jeff-davis commented 11 years ago

Thank you for the information -- still absorbing some of it, as I don't really understand the lifetime parameters yet.

I know a bit about the postgres protocol already, having worked on other drivers, though mostly I know it from libpq.

"FromSql implementations can't since the driver doesn't know which implementation is going to be used to decode the result until the results have already been transmitted over."

That's true, but the driver can request binary/text format for individual fields of the result, so there is some level of control there. On the other hand, not all types in postgres are required to offer send()/recv() (needed for binary representation), so we don't want to rely on binary. These are just some thoughts, I'm not really trying to make a point here.

"Any type that the driver handles natively (INT, BIGINT, VARCHAR, etc) is transferred in binary format. Everything else is transferred in text format."

That doesn't sound too bad to me, but I would like to see it be more flexible.

I would like to be able to use range types effectively with this driver. That may just mean that the type handling should be flexible enough that I just need to create a range type in rust and implement the traits; or we may want to include support in the driver itself if there is some reason (or if they are popular enough). Being able to use binary representations would certainly be nice with range types, because it would avoid the parsing overhead (which can be substantial for a TSTZRANGE).

More questions to come. I am trying to learn rust, and improve postgres support while I'm at it, if I can.

sfackler commented 11 years ago

Interesting, I didn't know that binary format was optional for custom types. Do you know if the backend will fall back to the text representation if the type doesn't implement a binary interface or will it raise an error?

I'm happy to add to the built in type support as long as there's a reasonable type in the Rust standard library to translate to. I see a couple of options for the various range types. The simplest one is just a 2-tuple (e.g. (i32, i32) for INT4RANGE). It seems like a generic Range<T> type would be better, though I'm a bit unsure if it should live in the Postgres driver or not. Support for the INTERVAL type is waiting on the rewrite of Rust's time library to add some sort of duration type.

jeff-davis commented 11 years ago

"Do you know if the backend will fall back to the text representation if the type doesn't implement a binary interface or will it raise an error?"

It will raise an error.

At the protocol level, you can request individual result fields to be binary or text, but it's a little tricky because you don't know what types you'll be getting back unless:

you describe it and wait for the response before executing (which would change the flow and introduce latency, if I'm not mistaken); or
you have some higher-level information from the application about what it's expecting.

So, I think that's why libpq punts and forces you to just ask for entirely binary or entirely text.

Grumble...

I think what we should do is just always do everything in binary if the user wants type conversions. We can offer a fallback method that just returns raw text or binary results with no conversions, as an alternative for the 1% of cases that need it.

Simple... if a user wants to support a new type, they just need to define the conversions from a rust type to binary and back.

"I'm happy to add to the built in type support as long as there's a reasonable type in the Rust standard library to translate to."

I'm guessing probably not then. Range types in postgres are fairly comprehensive; allowing either bound to be missing (infinite) as well as supporting discrete (e.g. INT4RANGE) and continuous ranges (e.g. NUMRANGE).

I'd be happy if the driver could be extended with extra types, and was flexible enough that a Range could be used.

sfackler commented 11 years ago

I'm okay with forcing binary. If a user really wants a textual representation, they can cast to VARCHAR or whatever in their query. I've also added support for INT4RANGE, INT8RANGE, TSRANGE and TSTZRANGE. The Range type still needs some fleshing out, though.

ghost commented 3 years ago

@sfackler I'm new so forgive me if I've missed docs somewhere, but how do you specify in .query that you are inserting a value of type INT4RANGE?

Through the one documented example I've seen, I understand how int and string values work, but it's not obvious to me what more is needed for more complex types.

Thanks and I'm responding here since you mentioned implementing INT4RANGE above.

sfackler / rust-postgres

Why are ToSql and FromSql different traits? #21