radbrt / target-oracle

Singer.io compatible target for Oracle databases
0 stars 14 forks source link

Create CLOB column type #11

Open gallejesus opened 10 months ago

gallejesus commented 10 months ago

Hi,

I'm trying to create a column for a string that exceeds 4k characters. Sadly, I cannot increase max string size in my target oracle db so the alternative would be to store this field in a CLOB type.

I have been taking a look about the types supported and cannot find a way to create this column.

Thank you!

radbrt commented 10 months ago

I was waiting for this to become an issue. The singer spec has no distinction between large and small strings, and certainly not on the arbitrary 4000 char limit in Oracle. The best we can hope for is for the tap to specify a max-length - but that is mostly just for database taps.

Two things we can do:

  1. add an if statement so that when maxlength is over 4000 it creates a clob.
  2. Add a config to write all strings as clobs (typically when loading csv files or other sources that don't give any guarantees about string length.

What do you think?

gallejesus commented 10 months ago

The first options was the first idea I tried and it worked changing it in my local, but it is true that it reduces flexibility. I think the second one is the most complete but I guess it depends on the amount of work required to complete it.

Maybe an approach would be to add the first option to unblock it for the moment and add the second one later?

If I'm not missing anything, the first option would be to add something like this in sinks.py:97, right?

maxlength = jsonschema_type.get("maxLength", 4000)
if maxlength > 4000:
    return cast(
        sqlalchemy.types.TypeEngine, sqlalchemy.types.CLOB()
    )
else:
    return cast(
        sqlalchemy.types.TypeEngine, sqlalchemy.types.VARCHAR(maxlength)
    )
gallejesus commented 10 months ago

Hello! Any thoughts on this? Thank you :)

radbrt commented 10 months ago

Hi @gallejesus, thanks for the reminder. I have created a new branch, clobbing, which creates a CLOB if the maxLength on strings is over 4000, and also a strings_as_clobs config option to always force strings to clobs.

I have tested the maxLength, but not the strings_as_clobs option yet. You can try it yourself if you add @clobbing to your pip_url, and run meltano install --clean. Let me know if it works for you.