prisma / tiberius

TDS 7.2+ (Microsoft SQL Server) driver for Rust
Apache License 2.0
321 stars 118 forks source link

TokenRow::with_capacity should be associated function #254

Closed aeshirey closed 1 year ago

aeshirey commented 1 year ago

The current implementation of TokenRow::with_capacity is a method and not an associated function:

/// Creates a new empty row with allocated capacity.
pub fn with_capacity(&self, capacity: usize) -> Self {
    Self {
        data: Vec::with_capacity(capacity),
    }
}

This means that you can't call TokenRow::with_capacity(100):

56  |     pub fn with_capacity(&self, capacity: usize) -> Self {
    |            ^^^^^^^^^^^^^
help: provide the argument
    |
211 |             let mut args = TokenRow::with_capacity(/* &TokenRow<'_> */, 100);
    |                            ~

It appears that the first &self arg should be dropped here?

Additionally, when using TokenRow, would it be appropriate to have a .clear() method to reuse the same alocated TokenRow and thus inner Vec?

pimeys commented 1 year ago

I'm trying to understand why and where this method was added. It's weirdly a public API, maybe for the bulk inserts.

To what are you using it for? I will definitely clean the API now and cut a release later this morning.

aeshirey commented 1 year ago

I have a project that's using Tiberius to interact with SQL Server, and inserting ~200k records takes 15 minutes (or maybe it was 30, I don't recall). So I'm attempting to do a bulk insert to speed this up. I don't know yet where the bottleneck is, but I noticed that I can't pre-allocate a buffer for these rows.

Unrelated to this issue, but I noticed that the ergonomics around datetime types is really difficult. Maybe I'm just misunderstanding the API, but trying to convert a chrono DateTime<Utc> to a SQL Server datetime2 isn't obvious. Using the .to_sql() method (and the ToSql trait), I encounter the following error when trying to execute a (bulk) insert:

Tiberius("BULK UPLOAD input failure: invalid data type, expecting Some(VarLenSized(VarLenContext { type: Datetime2, len: 7, collation: None })) but found DateTimeOffset(Some(DateTimeOffset { datetime2: DateTime2 { date: Date(738429), time: Time { increments: 78840000000, scale: 7 } }, offset: 0 }))")

pimeys commented 1 year ago

I'm willing to take docs PRs if you find problems!

For your issue, DateTime<Utc> needs datetimeoffset in the database, in datetime2 you'll lose the offset part silently, which is something we want to avoid. Read the type mappings from the docs.

For your case, the correct chrono type is NaiveDateTime.

pimeys commented 1 year ago

In general, I don't like silent data corruption... Like accepting f64 to real columns. Better to error out loud than have wrong data. The error you posted is horrible though, if you want you could maybe try to craft me a PR to make it look nicer!

aeshirey commented 1 year ago

I'm happy to provide examples and/or documentation once I get my code figured out (possibly with table schema redesign, as I naively assumed datetime2 <-> DateTime<Utc> to be interoperable).

pimeys commented 1 year ago

Take your time to read the docs, try out things. I know Tiberius is very low level, because our company needs it to work seamlessly in many scenarios and we build higher level abstractions on top of it. I've been thinking of building a batteries included version of it with a chosen runtime (Tokio), pooling, a nice transaction API and all that. Just don't have time after work and at work there are more urgent things to do.

aeshirey commented 1 year ago

Ooh, I like the idea of a batteries-included wrapper. I'd definitely consider contributing, as I spend a lot of time trying to figure those kinds of things out for easier, higher-level use.