pgjdbc / r2dbc-postgresql

Postgresql R2DBC Driver
https://r2dbc.io
Apache License 2.0

Performance issue with PostgresqlRow.getColumn(String name) when selecting many columns #636

Closed: yuki-teraoka closed this issue 8 months ago

yuki-teraoka commented 9 months ago

When a query selects many fields, a lot of CPU is consumed because each lookup of a column by name loops over the fields.

In one case I encountered, a select of millions of records with about 700 fields consumed 100% CPU for several hours, and I had to force the query to stop.

During that time I checked the Java stack trace many times, and most of the samples were executing the following for loop:

https://github.com/pgjdbc/r2dbc-postgresql/blob/d047276aec03b9d691cb343789dc8d933e0dad7f/src/main/java/io/r2dbc/postgresql/PostgresqlRow.java#L168-L174

Would it be possible to prepare a mapping from field name to column position in advance?

mp911de commented 9 months ago

This is possible. PGJDBC builds its column-name-to-index map lazily; see https://github.com/pgjdbc/pgjdbc/blob/master/pgjdbc/src/main/java/org/postgresql/jdbc/PgResultSet.java#L3108-L3136

Happy to merge a pull request.
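
For illustration, here is a minimal sketch of the lazy name-to-index map discussed above. It is not the driver's code: `ColumnIndex` and its methods are hypothetical names, a plain list of column names stands in for the real field metadata, and case-insensitive matching with "first match wins" for duplicate names is an assumption about the behavior to preserve.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the column metadata of a result; not a driver class.
final class ColumnIndex {

    private final List<String> columnNames;

    // Built on the first lookup by name, then reused for every later lookup.
    private Map<String, Integer> nameToIndex;

    ColumnIndex(List<String> columnNames) {
        this.columnNames = columnNames;
    }

    // Returns the zero-based position of the column with the given name.
    int indexOf(String name) {
        if (nameToIndex == null) {
            nameToIndex = buildIndex(columnNames);
        }
        // Assumption: names are matched case-insensitively.
        Integer index = nameToIndex.get(name.toLowerCase(Locale.ROOT));
        if (index == null) {
            throw new NoSuchElementException("Column name '" + name + "' does not exist");
        }
        return index;
    }

    private static Map<String, Integer> buildIndex(List<String> names) {
        Map<String, Integer> map = new HashMap<>(names.size() * 2);
        for (int i = 0; i < names.size(); i++) {
            // putIfAbsent keeps the first column when a name repeats,
            // which is what a front-to-back linear scan would return.
            map.putIfAbsent(names.get(i).toLowerCase(Locale.ROOT), i);
        }
        return map;
    }
}
```

With about 700 fields, the map is built once in a single pass and each later lookup is a hash lookup instead of a scan. The benefit depends on sharing one such object across all rows of a result rather than rebuilding it per row. The sketch is not thread-safe; a real change would need to consider how rows are accessed concurrently.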