jOOQ / jOOQ

jOOQ is the best way to write SQL in Java
https://www.jooq.org
Other
6.1k stars 1.2k forks source link

Add support for Loader.bulkAfter & Loader.onDuplicateKeyIgnore together #12740

Open juriad opened 2 years ago

juriad commented 2 years ago

Use case:

There are only a few banned combinations that are not even documented in JavaDoc (LoaderImpl):

    private final void checkFlags() {
        if (bulk != BULK_NONE && onDuplicate != ON_DUPLICATE_KEY_ERROR)
            throw new LoaderConfigurationException("Cannot apply bulk loading with onDuplicateKey flags. Turn off either flag.");
    }

I just got this exception and set myself to find out why. It looks pretty strange but maybe it is just from the point of view of my preferred database (Postgres). This indeed works correctly:

create table pk (i int primary key);

insert into pk values (1), (2), (3);
insert into pk values (4), (5), (3) on conflict do nothing ;
select * from pk;

It inserts 1,2,3,4,5 and reports 3 and 2 affected rows.

Is this condition still necessary? Can it be loosened, so that it is dialect-specific?

There used to be another condition that targeted batching; that one was removed in https://github.com/jOOQ/jOOQ/commit/84ad8608787022444710629ba8230929fcbd4f9f. Doesn't the same solution also work for bulking? The JavaDoc on InsertQuery.onDuplicateKeyIgnore does not mention any exclusions for multiple values.

(Also, it would be nice if not only primary key but also unique keys were considered, but that is a different story...)

Possible solution you'd like to see:

Add support for bulking together with onDuplicateKeyIgnore (at least for dialects where it is possible).

Possible workarounds:

Turn off either flag as suggested.

Versions:

lukaseder commented 2 years ago

Thanks for your suggestion.

Agreed, the ON DUPLICATE KEY IGNORE flag should work without any issues when there's native support (in MySQL style or PostgreSQL style dialects). It might even work with the INSERT .. SELECT emulation. I'll check it out.

When this was implemented (a long time ago), I suspect:

I'll research this a bit further, but probably it was something that didn't work at the time, years ago, and might work now.

ncheema commented 1 year ago

@lukaseder, this feature has been expanding its scope for some time now. Is there a chance that we could see it in the next release? The bulk feature is incredibly potent, particularly for scenarios where ingestion speed is critical. At present, we manage by utilizing batchMerge but manually determining the bulk, batch, and commit sizes. Usecase: load data with onDuplicateKeyUpdate() condition. Note it be valuable if batchX APIS also has throttling options as load apis.