scylladb / python-driver

ScyllaDB Python Driver, originally DataStax Python Driver for Apache Cassandra
https://python-driver.docs.scylladb.com
Apache License 2.0

cqlengine: Remove deepcopy on UserType deserialization #277

Closed k0machi closed 9 months ago

k0machi commented 11 months ago

This change stops the UserType instance that is newly created during deserialization from being immediately copied with deepcopy. That copy could cause a huge slowdown when the UserType contains a lot of data or nested UserTypes: the deepcopy calls cascade, because each to_python call eventually clones parts of the source object. There isn't much information on why this deepcopy was added in the first place, so this change could potentially break something. Running the integration tests against this commit does not produce regressions, so the call looks safe to remove, but I'm leaving this warning here for future reference.
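To illustrate the cascade described above, here is a minimal sketch. It is not the driver's code: `Node`, `convert_with_copy`, `convert_in_place`, `build_chain`, and `count_copies` are hypothetical names that only model a conversion step which deep-copies its input before recursing into nested values.

```python
from copy import deepcopy


class Node:
    """Hypothetical stand-in for a nested user-defined type value."""

    copies = 0  # how many Node objects deepcopy has created so far

    def __init__(self, child=None):
        self.child = child

    def __deepcopy__(self, memo):
        # Count every clone so the cost of the cascade is visible.
        Node.copies += 1
        return Node(deepcopy(self.child, memo))


def convert_with_copy(node):
    """Copy the whole value first, then recurse into the nested value."""
    if node is None:
        return None
    copied = deepcopy(node)  # clones this node and everything below it
    copied.child = convert_with_copy(copied.child)  # which clones again
    return copied


def convert_in_place(node):
    """Same traversal, but without the defensive copy."""
    if node is None:
        return None
    node.child = convert_in_place(node.child)
    return node


def build_chain(depth):
    """Build `depth` nodes nested inside each other."""
    node = None
    for _ in range(depth):
        node = Node(node)
    return node


def count_copies(depth, convert):
    """Return how many Node clones `convert` causes for a chain of `depth`."""
    Node.copies = 0
    convert(build_chain(depth))
    return Node.copies
```

In this toy model a chain of depth n is cloned n + (n - 1) + ... + 1 times when the copy is kept, so the work grows quadratically with nesting depth, while the in-place variant creates no clones at all.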

Fixes #152

Lorak-mmk commented 11 months ago

Do you think you could open this PR against upstream too (datastax/python-driver)? I'd like to see if they see any reason not to merge this.

fruch commented 11 months ago

Do you think you could open this PR against upstream too (datastax/python-driver)? I'd like to see if they see any reason not to merge this.

FYI, no one responded on the issue that was opened: https://datastax-oss.atlassian.net/browse/PYTHON-1309

k0machi commented 10 months ago

Opened in upstream: https://github.com/datastax/python-driver/pull/1192

fruch commented 10 months ago

@Lorak-mmk

There was no response from upstream people yet

I'd say we will merge this performance improvement, since our testing didn't show it breaking anything.

mykaul commented 10 months ago

@Lorak-mmk

There was no response from upstream people yet

I'd say we will merge this performance improvement, since our testing didn't show it breaking anything.

fruch commented 10 months ago

@Lorak-mmk There was no response from upstream people yet I'd say we will merge this performance improvement, since our testing didn't show it breaking anything.

@k0machi was talking about other places where there is a deepcopy that he wanted to remove, not this PR

  • Do we happen to have performance numbers for this change?

@k0machi did a manual test with Argus calls, which showed a big improvement across the board.

we don't have a specific performance suite for this driver, especially not for cqlengine.

mykaul commented 10 months ago

we don't have a specific performance suite for this driver, especially not for cqlengine.

A before/after would hopefully suffice, showing reduced latency (I've seen elsewhere in the threads a reduction of seconds - https://github.com/scylladb/python-driver/issues/152#issuecomment-1302355091 ?)

k0machi commented 10 months ago

we don't have a specific performance suite for this driver, especially not for cqlengine.

A before/after would hopefully suffice, showing reduced latency (I've seen elsewhere in the threads a reduction of seconds - #152 (comment) ?)

It doesn't quite show the latency numbers, but this request went down from ~9s avg to 750ms average
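For anyone who wants a rough before/after number without a dedicated performance suite, a sketch along these lines might be enough. It assumes nothing about the driver: `make_payload`, `before`, `after`, and `measure` are hypothetical helpers, and a nested dict only loosely stands in for a large UserType value.

```python
import timeit
from copy import deepcopy


def make_payload(width=50, depth=3):
    """Build a hypothetical nested dict standing in for a large UDT value."""
    if depth == 0:
        return {"value": "x" * 20}
    return {
        f"field_{i}": make_payload(width // 5 or 1, depth - 1)
        for i in range(width)
    }


def before(payload):
    """Stands in for the old behavior: copy the value first."""
    return deepcopy(payload)


def after(payload):
    """Stands in for the new behavior: use the value as-is."""
    return payload


def measure(func, payload, repeat=5, number=20):
    """Return the best average seconds per call over several repeats."""
    timings = timeit.repeat(lambda: func(payload), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    payload = make_payload()
    print(f"with deepcopy:    {measure(before, payload):.6f} s/call")
    print(f"without deepcopy: {measure(after, payload):.6f} s/call")
```

This only isolates the cost of the copy itself. The request-level numbers above remain the more meaningful measurement, since they include the real cqlengine deserialization path.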

mykaul commented 10 months ago

this request went down from ~9s avg to 750ms average

Thanks - this is what I was hoping to see. Flamegraphs are great, but perf results are more important.

mykaul commented 10 months ago

I believe we have a release pending this week. If so, I'd wait until after the release instead of sending it in at the last minute.

mykaul commented 9 months ago

@avelanarius - can we merge it? The lack of response from upstream is disappointing, but should not hold us back.