Closed jeff-davis closed 10 years ago
This is the right place to discuss it :smiley:
It is absolutely true that if a type can be transformed in one direction, it should probably be able to be transformed in the other. However, the traits end up being separate for technical reasons. Take, str
for example. If we're turning a Postgres VARCHAR
value into a Rust string, we have to return a ~str
, so we need to implement FromSql
for ~str
. But, we want to be able to turn any Rust string into a VARCHAR
value, whether or not it is an owned string (~"foo"
is an owned string and "foo"
is a static string). The way to handle this is to implement ToSql
for borrowed pointers to strings (&'self str
). So, we end up with
impl FromSql for ~str { ... }
impl<'self> ToSql for &'self str { ... }
The binary versus textual representation for data comes from the Postgres communication protocol. The basic flow when executing a statement goes like:
So, ToSql
implementations can choose what format they want to encode to, but FromSql
implementations can't since the driver doesn't know which implementation is going to be used to decode the result until the results have already been transmitted over.
The way that the formats for result values are specified is currently a bit hacky. Any type that the driver handles natively (INT, BIGINT, VARCHAR, etc) is transferred in binary format. Everything else is transferred in text format. You can call the result_format
method on the PostgresType
value passed to from_sql
to determine what format the data is in.
I'm not totally sure what the right thing to do with the representation of unknown Postgres types. Text seems safest, but it might be worth sending everything over in binary representation.
Thank you for the information -- still absorbing some of it, as I don't really understand the lifetime parameters yet.
I know a bit about the postgres protocol already, having worked on other drivers, though mostly I know it from libpq.
"FromSql implementations can't since the driver doesn't know which implementation is going to be used to decode the result until the results have already been transmitted over."
That's true, but the driver can request binary/text format for individual fields of the result, so there is some level of control there. On the other hand, not all types in postgres are required to offer send()/recv() (needed for binary representation), so we don't want to rely on binary. These are just some thoughts, I'm not really trying to make a point here.
"Any type that the driver handles natively (INT, BIGINT, VARCHAR, etc) is transferred in binary format. Everything else is transferred in text format."
That doesn't sound too bad to me, but I would like to see it be more flexible.
I would like to be able to use range types effectively with this driver. That may just mean that the type handling should be flexible enough that I just need to create a range type in rust and implement the traits; or we may want to include support in the driver itself if there is some reason (or if they are popular enough). Being able to use binary representations would certainly be nice with range types, because it would avoid the parsing overhead (which can be substantial for a TSTZRANGE
).
More questions to come. I am trying to learn rust, and improve postgres support while I'm at it, if I can.
Interesting, I didn't know that binary format was optional for custom types. Do you know if the backend will fall back to the text representation if the type doesn't implement a binary interface or will it raise an error?
I'm happy to add to the built in type support as long as there's a reasonable type in the Rust standard library to translate to. I see a couple of options for the various range types. The simplest one is just a 2-tuple (e.g. (i32, i32)
for INT4RANGE
). It seems like a generic Range<T>
type would be better, though I'm a bit unsure if it should live in the Postgres driver or not. Support for the INTERVAL
type is waiting on the rewrite of Rust's time library to add some sort of duration type.
"Do you know if the backend will fall back to the text representation if the type doesn't implement a binary interface or will it raise an error?"
It will raise an error.
At the protocol level, you can request individual result fields to be binary or text, but it's a little tricky because you don't know what types you'll be getting back unless:
So, I think that's why libpq punts and forces you to just ask for entirely binary or entirely text.
Grumble...
I think what we should do is just always do everything in binary if the user wants type conversions. We can offer a fallback method that just returns raw text or binary results with no conversions, as an alternative for the 1% of cases that need it.
Simple... if a user wants to support a new type, they just need to define the conversions from a rust type to binary and back.
"I'm happy to add to the built in type support as long as there's a reasonable type in the Rust standard library to translate to."
I'm guessing probably not then. Range types in postgres are fairly comprehensive; allowing either bound to be missing (infinite) as well as supporting discrete (e.g. INT4RANGE
) and continuous ranges (e.g. NUMRANGE
).
I'd be happy if the driver could be extended with extra types, and was flexible enough that a Range
I'm okay with forcing binary. If a user really wants a textual representation, they can cast to VARCHAR
or whatever in their query. I've also added support for INT4RANGE
, INT8RANGE
, TSRANGE
and TSTZRANGE
. The Range
type still needs some fleshing out, though.
@sfackler I'm new so forgive me if I've missed docs somewhere, but how do you specify in .query
that you are inserting a value of type INT4RANGE
?
Through the one documented example I've seen, I understand how int and string values work, but it's not obvious to me what more is needed for more complex types.
Thanks and I'm responding here since you mentioned implementing INT4RANGE
above.
First of all, please tell me if there is a better forum for discussion like this. Also, as a disclaimer, I am very new to rust.
A type is pretty useless unless it has conversions in both directions, right? So why not just make one trait for both directions?
Also, what is the model for binary versus text representations? Going in, it seems like the type can represent itself as either text or binary; but coming back, the type needs to either support both, or at least declare which one it does support.