fzoli opened 4 years ago
Thanks for your proposal. How does this work right now with PGJDBC? Ideally, we can reuse Blob
as the streaming interface is already available there.
`Connection` of PGJDBC provides `LargeObjectManager` via `connection.unwrap(PGConnection.class).getLargeObjectAPI()`.

Methods of `LargeObjectManager`:
- `LargeObject open(long oid, int mode)`: opens an existing large object, based on its OID
- `long create(int mode)`: creates a large object, returning its OID
- `void delete(long oid)`: deletes the large object

Argument `mode` can be `READ`, `WRITE`, `READWRITE`.
`LargeObject` is a stream-like object with seek support.

Methods of `LargeObject`:
- `OutputStream getOutputStream()`: returns an `OutputStream` to the large object
- `InputStream getInputStream(long limit)`: returns an `InputStream` that will limit the amount of data that is visible
- `InputStream getInputStream()`: returns an `InputStream` from the large object
- `byte[] read(int length)`: reads some data from the object and returns it as a `byte[]` array
- `int read(byte[] buffer, int offset, int length)`: reads some data from the object into an existing array
- `void write(byte[] buffer)`: writes an array to the object
- `void write(byte[] buffer, int offset, int length)`: writes some data from an array to the object
- `seek(long position, int reference)`: sets the current position within the object
- `long tell()`: returns the current position within the object
- `long size()`: returns the size of the large object; inefficient, as the only way to find out the size is to seek to the end, record the current position, then return to the original position

Argument `reference` can be `SEEK_SET`, `SEEK_CUR`, `SEEK_END`.
Internally, `LargeObject` uses the Fastpath API.
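For context, a blocking read with this API looks roughly like the sketch below. The `LargeObjectManager` calls are shown as comments since they need a live connection; the copy loop is the part that stays memory efficient by reusing a single buffer, regardless of object size:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class LargeObjectCopy {

    // Reading a large object with PGJDBC (sketch; needs a live connection):
    //
    //   LargeObjectManager lom =
    //       connection.unwrap(PGConnection.class).getLargeObjectAPI();
    //   LargeObject lo = lom.open(oid, LargeObjectManager.READ);
    //   copy(lo.getInputStream(), target);
    //   lo.close();

    // One reusable buffer per copy, no matter how big the object is.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // reused for every chunk
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100_000];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink);
        System.out.println(copied);
        System.out.println(java.util.Arrays.equals(data, sink.toByteArray()));
    }
}
```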
I reconsidered my request. `LargeObject` is memory efficient because the client code uses the same byte array in a loop. With reactive streams, data is immutable, so we can only consume `ByteBuffer` chunks. With a large file, maybe the overhead of the chunks is worse than the memory requirement of a new thread that can use the blocking API. Or not?
Thanks for the details. The Large Object API is pretty different from a plain `Blob`, so it seems using `io.r2dbc.Blob` isn't necessarily a suitable approach.

> `LargeObject` is memory efficient because the client code uses the same byte array in a loop
With this API being a driver-specific API, we can use Netty's `ByteBuf`, which supports pooled buffers, so GC pressure is covered. Looking at PGJDBC's `LargeObject`, we should be able to mirror the same API using reactive types (e.g. `Mono<ByteBuf> read(int length)` and `ByteBufFlux read()` corresponding with `getInputStream()`, `Mono<Long> write(Flux<ByteBuf>)` corresponding with `getOutputStream()`).
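To make that shape concrete, here is a sketch of what such a driver-specific interface could look like. JDK `java.util.concurrent.Flow` and `ByteBuffer` stand in for Reactor's `Mono`/`Flux` and Netty's `ByteBuf` so the snippet is self-contained, and all names are hypothetical. The helper shows the chunking a reactive `read()` would emit:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow.Publisher;

public class ReactiveLargeObjectSketch {

    // Hypothetical driver-side API mirroring PGJDBC's LargeObject with
    // reactive types (JDK Flow stands in for Reactor's Mono/Flux here).
    interface ReactiveLargeObject {
        Publisher<ByteBuffer> read();                      // ~ getInputStream()
        Publisher<ByteBuffer> read(long limit);            // ~ getInputStream(long limit)
        Publisher<Long> write(Publisher<ByteBuffer> data); // ~ getOutputStream()
        Publisher<Long> size();
        Publisher<Void> seek(long position);
    }

    // With reactive streams, each emitted chunk is an independent,
    // effectively immutable buffer; this shows the chunking read() would do.
    static List<ByteBuffer> chunks(byte[] data, int chunkSize) {
        List<ByteBuffer> out = new ArrayList<>();
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            // copy into a fresh buffer so downstream consumers never share state
            out.add(ByteBuffer.wrap(java.util.Arrays.copyOfRange(data, off, off + len)));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = new byte[65_536 * 2 + 100];
        List<ByteBuffer> cs = chunks(data, 65_536);
        System.out.println(cs.size());             // number of chunks
        System.out.println(cs.get(2).remaining()); // size of the last chunk
    }
}
```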
The FastPath API is largely deprecated in favor of parameterized statements. Maybe @davecramer can shed a bit of light here.
@mp911de what is the question? See https://www.postgresql.org/docs/current/libpq-fastpath.html for the reasoning behind the deprecation.
The actual question is what the replacement should look like. I wasn't able to find an example of how to call the functionality required for the Large Object API, such as `lo_close`, `loread`, `lo_tell`, and others, via SQL.
Interesting, looks like the driver still uses FastPath and we don't have any tests for it. Honestly, I don't use it, but I could probably figure it out when I have some time. If they happen to figure it out first, that would be awesome.
Thanks, Dave. I marked this ticket as ideal for contribution.
It makes sense to add such a feature to our library and we can work out the actual API when we receive a design proposal in form of a PR.
@fzoli You can call any of those functions without the driver's explicit support. Take a look here.
But there's one bad thing about reading: the backend will send data in 65536-byte chunks, but our driver won't decode it until it receives the full packet. So we either have to rework our receiving process to support lazy decoding, or we have to call `lo_get` multiple times with a reasonable `for` parameter until the whole file is here.
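`lo_get(loid, offset, length)` is a server-side function (PostgreSQL 9.4+), so the multi-call workaround can be planned client-side. A sketch of such a call plan, assuming a made-up OID; in practice each call would go through a parameterized statement rather than string concatenation:

```java
import java.util.ArrayList;
import java.util.List;

public class LoGetChunked {

    // Bound each server response by requesting at most `chunk` bytes per
    // lo_get call; this computes the sequence of calls needed to read the
    // whole object. Executing them through the driver is the open question.
    static List<String> callPlan(long oid, long totalSize, int chunk) {
        List<String> calls = new ArrayList<>();
        for (long off = 0; off < totalSize; off += chunk) {
            long len = Math.min(chunk, totalSize - off);
            calls.add("SELECT lo_get(" + oid + ", " + off + ", " + len + ")");
        }
        return calls;
    }

    public static void main(String[] args) {
        // 16411 is a hypothetical OID; 65536 matches the backend chunk size above
        List<String> plan = callPlan(16411L, 150_000L, 65_536);
        System.out.println(plan.size());
        System.out.println(plan.get(2));
    }
}
```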
In JDBC, the OID-based large objects are exposed as `java.sql.Blob`. I've not really looked at R2DBC, but I don't see why the `io.r2dbc.Blob` interface wouldn't be suitable for the common case (read/write a stream).
That was my thought at first, too. The large object API allows for several additional methods, such as seeking, truncation, and length introspection. We need a bit of code to see how this enhancement goes and how the API shapes up.
So is LO support a necessity? I don't really see many people using this in JDBC?
I don't know as I'm not so much involved in how folks use Postgres with large objects. I don't mind making sure our driver is able to work with server-side functions so that this feature could live also outside of the driver.
Makes sense I guess. I probably wouldn't prioritize it.
As an actual Postgres user, blob support is crucial. A bytea is not a blob.
This is like arguing you don’t need to support char because you don’t see many people using it.
Speaking from some authority as primary maintainer for the PostgreSQL JDBC driver and major contributor for Postgres and long time user (since 1999) I honestly don't see a lot of use of large objects. Now perhaps I am myopic, I don't know. That said I didn't say don't support it, I said I wouldn't prioritize it.
If by "large objects" you mean `LargeObject`s, then perhaps that's fair, but most people will use large objects in Java via `Blob`.
Yes I was referring to LargeObjects.
Feature Request

PostgreSQL supports two types of large objects:
- `bytea` (the R2DBC `SPI` supports it with `Blob`)
- Large Object with OID (the `SPI` has no such support)

Here are the differences: stream-style access makes it possible to implement a server that supports HTTP range requests. Is it possible to create an extension for `r2dbc-postgresql` to support LargeObject with OID?

Describe the solution you'd like

`Blob` from the `SPI` cannot be used because data manipulation of a LargeObject is not possible with insert/update clauses. `io.r2dbc.postgresql.api.PostgresqlConnection` could provide something like:
- `Publisher<ByteBuffer> readLargeObject(long oid)`
- `Publisher<ByteBuffer> readLargeObject(long oid, Range range)`
- `Mono<Long> createLargeObject(Publisher<ByteBuffer>)`
- `Mono<Void> deleteLargeObject(long oid)`

The `Range` can be described with two attributes:
- a `long` start position (starts from zero)
- a `long` maximum number of requested bytes (optional)
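A minimal sketch of how such a `Range` could look in Java; the field and method names are hypothetical, only the two attributes above come from the request:

```java
import java.util.OptionalLong;

public class RangeSketch {

    // Hypothetical Range: a zero-based start position plus an optional
    // cap on the number of bytes to read.
    record Range(long position, OptionalLong limit) {
        Range {
            if (position < 0) throw new IllegalArgumentException("position must be >= 0");
        }

        static Range from(long position) {
            return new Range(position, OptionalLong.empty());
        }

        static Range of(long position, long limit) {
            return new Range(position, OptionalLong.of(limit));
        }

        // Bytes to actually request from an object of the given total size,
        // as an HTTP range-serving caller would compute it.
        long bytesToRead(long objectSize) {
            long available = Math.max(0, objectSize - position);
            return limit.isPresent() ? Math.min(available, limit.getAsLong()) : available;
        }
    }

    public static void main(String[] args) {
        System.out.println(Range.of(100, 50).bytesToRead(120)); // clamped by object size
        System.out.println(Range.from(100).bytesToRead(120));   // no limit: rest of object
        System.out.println(Range.of(0, 50).bytesToRead(120));   // clamped by limit
    }
}
```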