embulk / embulk-output-jdbc

MySQL, PostgreSQL, Redshift and generic JDBC output plugins for Embulk
Other
88 stars 86 forks source link

Add delete insert to tuncate_insert alias #321

Closed naka-sho closed 1 year ago

naka-sho commented 1 year ago

mode:tuncate_insert does not actually use the truncate statement, it executes the delete statement

https://github.com/embulk/embulk-output-jdbc/blob/dfb2192cd0162ac32b9111f1de7f28177e4cf902/embulk-output-jdbc/src/main/java/org/embulk/output/jdbc/AbstractJdbcOutputPlugin.java#L890-L897 https://github.com/embulk/embulk-output-jdbc/blob/dfb2192cd0162ac32b9111f1de7f28177e4cf902/embulk-output-jdbc/src/main/java/org/embulk/output/jdbc/JdbcOutputConnection.java#L386-L417 https://github.com/embulk/embulk-output-jdbc/blob/dfb2192cd0162ac32b9111f1de7f28177e4cf902/embulk-output-jdbc/src/main/java/org/embulk/output/jdbc/JdbcOutputConnection.java#L419-L427

Please add delete_insert with another name because there is a possibility of misunderstanding that rollback is not possible.

I think that misunderstandings can be eliminated by adding the fact that the delete statement is explicitly issued to mode.

dmikurube commented 1 year ago

I basically don't think we'll casually have this kind of aliases, at least in github.com/embulk plugins.

The biggest reason is because such an alias could bring much more confusions rather than reducing misunderstandings. Users would still search for truncate_insert, and maybe for delete_insert, and they would be confused.

We would have such an alias only when we have a migration period for incompatible changes. I don't see it has so much benefits to remove truncate_insert finally. We ultimately cannot remove truncate_insert for compatibility.

In addition, having this kind of aliases would increase the need for us, maintainers, to take care of things. I don't see it would have benefits to spend our efforts, not just now, but for a long time for the future.

naka-sho commented 1 year ago

Thank you for answering.

I understand that you have a specific intention in mind.

I made the following two points as reasons for my revision:

I thought it would be better to change the name if we are not going to execute the TRUNCATE statement. TRUNCATE generally does not allow for data rollback, so I couldn't use truncate_insert without prior knowledge. I was hesitant to use it initially, but after examining the internal logic and realizing it uses delete, I decided to use it without any concerns. A similar discussion can be found in the following blog: https://imagawa.hatenadiary.jp/entry/2019/08/26/190000

At the very least, I would like the README to mention that truncate_insert actually issues a delete statement.

If you don't fix it, you can close it without any problems

hiroyuki-sato commented 1 year ago

Hello, @naka-sho. Thank you for your contribution.

I agree with @dmikurube.

It is explain the truncate_insert is transactional in Mode description.

For avoiding confusion, It would be helpful to add delete from statement in the following doc.

For example.

Current doc.

Proposed change.

naka-sho commented 1 year ago

@hiroyuki-sato

This is a great suggestion. Thank you very much. I will make the modifications here and send another pull request.

naka-sho commented 1 year ago

@dmikurube @hiroyuki-sato

Please reconfirm

https://github.com/embulk/embulk-output-jdbc/pull/324