Closed mirelagrigoras closed 5 years ago
I like this approach since values would get rejected still providing feedback to users.
@mirelagrigoras what would you think about providing a default from the configuration and using that? ie, a user could specify default_column_width
.
@awm33 thoughts here?
To me, this seems to be a great idea.This way, the target could be customized by the user according to their needs.
În Sâm, 9 feb. 2019, 01:26 Alexander Mann <notifications@github.com a scris:
I like this approach since values would get rejected still providing feedback to users.
@mirelagrigoras https://github.com/mirelagrigoras what would you think about providing a default from the configuration and using that? ie, a user could specify default_column_width.
@awm33 https://github.com/awm33 thoughts here?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/datamill-co/target-redshift/pull/9#issuecomment-461981646, or mute the thread https://github.com/notifications/unsubscribe-auth/AtMRVGeedfRbEPzpzOHpfFQh6E_jSTPgks5vLgesgaJpZM4avkpL .
I like this approach since values would get rejected still providing feedback to users.
@mirelagrigoras what would you think about providing a default from the configuration and using that? ie, a user could specify
default_column_width
.@awm33 thoughts here?
I added the default length in the configuration file. Let me know what you think @awm33 @AlexanderMann
Thanks!
@mirelagrigoras This great! Thank you for your contribution! I just noticed a one line issue, if you could take a look
@mirelagrigoras Great! Approved! We'll get this merged and deployed soon
@awm33 Actually, let me explain what was the logic when I set the value to self.MAX_LENGTH, because I think this is the correct way to handle the situation.
So, from what I understand, in the schema there might be a key "maxLength". If there is, max_length will be sent to that value, otherwise, it will be set to self.default_column_length. However, if the key exists in the dictionary and its value is greater than the default, then we should set max_length to self.MAX_VARCHAR since setting it to a lower value could cause the import of the data into the table to fail (resulting into a String length exceeds DDL length
error).
I migth be wrong, so please, let me know if I misunderstood something.
@mirelagrigoras Ah, I see, so if maxLength
is set, it overrides the default. If that's the intent, then I think I would have expected:
max_length = schema.get('maxLength', self.default_column_length)
if max_length > self.MAX_VARCHAR:
max_length = self.MAX_VARCHAR
This way, there is a default, and it can be overridden by maxLength
in the schema, but we don't necessarily set it to MAX_VARCHAR when it's greater than the default. For example if the default was 2000, and maxLength
was 2500, we would set it to 64k.
@awm33 That is a great catch.This should be the expected behaviour.I have changed the condition.Thank you for your time!
În Dum, 10 feb. 2019, 00:25 Andrew Madonna <notifications@github.com a scris:
@mirelagrigoras https://github.com/mirelagrigoras Ah, I see, so if maxLength is set, it overrides the default. If that's the intent, then I think I would have expected:
max_length = schema.get('maxLength', self.default_column_length) if max_length > self.MAX_VARCHAR: max_length = self.MAX_VARCHAR
This way, there is a default, and it can be overridden by maxLength in the schema, but we don't necessarily set it to MAX_VARCHAR when it's greater than the default. For example if the default was 2000, and maxLength was 2500, we would set it to 64k.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/datamill-co/target-redshift/pull/9#issuecomment-462084900, or mute the thread https://github.com/notifications/unsubscribe-auth/AtMRVOELsZe1OhDzb8Gyla90DEPaVLy-ks5vL0q9gaJpZM4avkpL .
@mirelagrigoras looks good. I'm going to go ahead and merge this. At the beginning of next week, if there are no other changes, I'll get a release of this ready to go!
@mirelagrigoras version 0.0.3 is now live!
Thank you both so much!
În Lun, 18 feb. 2019, 17:32 Alexander Mann <notifications@github.com a scris:
@mirelagrigoras https://github.com/mirelagrigoras version 0.0.3 is now live! https://pypi.org/project/target-redshift/0.0.3/
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/datamill-co/target-redshift/pull/9#issuecomment-464776478, or mute the thread https://github.com/notifications/unsubscribe-auth/AtMRVFaCx5E-BqC5T_U7CrpV0z5CeG4zks5vOsd7gaJpZM4avkpL .
When working with a large number of columns, setting the column length as the maximum varchar length( 65,535 characters) could cause an error:
target_postgres.postgres.PostgresError: ('Exception writing records', QueryCanceledError('Intermediate result row exceeds database block size\n',))
This is why I suggest decreasing the default (also, this would decrease the table size, so, the cost of our Redshift cluster). Yes, this is still a limitation, but sometimes is better to start by using a smaller column length and increase it afterwards(if needed) than the opposite.