embulk / embulk-input-jdbc

MySQL, PostgreSQL, Redshift and generic JDBC input plugins for Embulk
Other
102 stars 74 forks source link

call pageBuilder close function to release buffer #225

Open case-k-git opened 3 years ago

case-k-git commented 3 years ago

Summary

pageBuilder.close is not called and could be cause leak issue. add pageBuilder.close to to release buffer (buffer.release())

code

    @Override
    public void close() {
        this.delegate.close();
    }

https://github.com/embulk/embulk/blob/76a0c9922c551f1d7e6b964642a202edab745435/embulk-api/src/main/java/org/embulk/spi/PageBuilder.java#L201

    @Deprecated  // https://github.com/embulk/embulk/issues/1321
    public PageBuilder(BufferAllocator allocator, Schema schema, PageOutput output) {
        this(createImplInstance(allocator, schema, output));
    }

https://github.com/embulk/embulk/blob/76a0c9922c551f1d7e6b964642a202edab745435/embulk-api/src/main/java/org/embulk/spi/PageBuilder.java#L49

    private static PageBuilder createImplInstance(final BufferAllocator allocator, final Schema schema, final PageOutput output) {
        try {
            return Holder.CONSTRUCTOR.newInstance(allocator, schema, output);
        } catch (final IllegalAccessException | IllegalArgumentException | InstantiationException ex) {
            throw new LinkageError("[FATAL] org.embulk.spi.PageBuilderImpl is invalid.", ex);
        } catch (final InvocationTargetException ex) {
            throwCheckedForcibly(ex.getTargetException());
            return null;  // Should never reach.
        }
    }

https://github.com/embulk/embulk/blob/76a0c9922c551f1d7e6b964642a202edab745435/embulk-api/src/main/java/org/embulk/spi/PageBuilder.java#L204

    @Override
    public void close() {
        if (buffer != null) {
            buffer.release();
            buffer = null;
            bufferSlice = null;
        }
        output.close();
    }

https://github.com/embulk/embulk/blob/76a0c9922c551f1d7e6b964642a202edab745435/embulk-core/src/main/java/org/embulk/spi/PageBuilderImpl.java#L247

releated

https://github.com/embulk/embulk/issues/1218#issuecomment-586336178

hiroyuki-sato commented 3 years ago

Hello, @case-k-git.

Thank you for creating this PR.

We know that this problem has been unresolved for a long time. Embulk Core team are investigating how to resolve this issue. If this PR can fix the current problem, please describe it in detail. Otherwise, please use a plugin as is.

dmikurube commented 3 years ago

Thanks for your contribution! But sorry that we have to say please wait for getting this merged.

To be honest, we're not yet 100% confident of :

At the same time, the team is now involved in the v0.10 efforts, then we have no resources to dive deeper into those questions.

case-k-git commented 3 years ago

@hiroyuki-sato @dmikurube

Thank you for your reply and check.

I see . I will be wait until finish to Embulk Core team investigation finished

I am not sure if this is related. previously I have got Buffer detected double release() when using sqlserver-input-plugin and output-bigquery-plugin. this error has been caused also when transfering large amount of data.

and caused when disk space is not enough(by using enough space the error has been gone). https://github.com/embulk/embulk-output-bigquery/issues/138 seems to be resource is not handling well

I am not test it but if create small size of disk using docker in local env same error could be make without large amount of data. using sqlserver-input-plugin and output-bigquery-plugin (I am not sure output-elasticsearch-plugin)

hope this gonna help to investigate

thank you

JPWon-1 commented 2 years ago

I have same issue when i try to move data from redis to bigquery

so like @case-k-git said, it could be embulk core issue or it could be embulk-output-bigquery issue

JongpilW0N commented 2 years ago

I have same issue when i try to move data from redis to bigquery

so like @case-k-git said, it could be embulk core issue or it could be embulk-output-bigquery issue

*redis to bigquery -> bigquery to redis so it was not embulk-output-bigquery, but embulk-input-bigquery

anyway i think bigquery plugin has some issues