seanharr11 / etlalchemy

Extract, Transform, Load: Any SQL Database in 4 lines of Code.
MIT License
555 stars 83 forks source link

column_schema_transformation_file not work? #19

Closed alraben closed 7 years ago

alraben commented 7 years ago

Hi,

I'm testing the project migrating a oracle database to mysql database and I have problem with a datatype of many columns. By defect some columns are detected like "VARCHAR(1)" instead "VARCHAR(2)". I don't know the reason.

I have a file called column_mappings.csv with the following information:

idcolegio,colegiado,idcolegio,varchar(2),False

"idcolegio" is the column of the table "colegiado" on the source "colegiado" is the column of the table on the source "idcolegio" is the name of the new column on the target "varchar(2)" is the type of the column idcolegio in the target

and in the migration.py script I have the following:

oracle_db_source = ETLAlchemySource("oracle+cx_oracle://CGCOM_20170401:CGCOM_20170401@10.10.30.75/xe",included_tables=["colegiado"],column_schema_transformation_file="/Users/alraben/Documents/CGCOM/alvaro/Migracion_Oracle_to_MySQL/transformations/column_mappings.csv")

but not work the information of this file. No errors on file and its path.

Where is the problem?

Thanks.

seanharr11 commented 7 years ago

Hi @alraben - this is most likely due to ETLAlchemy 'minimizing' the size of VARCHAR columns. The basic logic is:

  1. Scan through all rows in the table
  2. For each column, find the MAX sized string in each VARCHAR column
  3. Set the column size (on the target) to be the nearest_power_of_two of that column's max string size

To override this, install the package from github (the PyPi package is out-of-date). There is now a compress_varchar argument of the ETLAlchemySource.__init__() function, which defaults to False, which is what you want.

Try this and let me know how it works!

alraben commented 7 years ago

I made you said me and it was ok.

But I have a doubt:

pip etlalchemy versión is 1.1

git version 1.0.6

etlalchemy (1.0.6) How is it possible?

Thanks!

2017-05-12 14:18 GMT+02:00 Sean Harrington notifications@github.com:

Hi @alraben https://github.com/alraben - this is most likely due to ETLAlchemy 'minimizing' the size of VARCHAR columns. The basic logic is:

  1. Scan through all rows in the table
  2. For each column, find the MAX sized string in each VARCHAR column
  3. Set the column size (on the target) to be the nearest_power_of_two of that column's max string size

To override this, install the package from github (the PyPi package is out-of-date). There is now a compress_varchar argument of the ETLAlchemySource.init() function, which defaults to False, which is what you want.

Try this and let me know how it works!

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/seanharr11/etlalchemy/issues/19#issuecomment-301061513, or mute the thread https://github.com/notifications/unsubscribe-auth/AUJEQMbx6pf7kdMkJEYRlmSQjXbpJH2Mks5r5E4vgaJpZM4NZKxI .

-- Álvaro Ramiro Benavides alraben(arroba)gmail(dot)com

alraben commented 7 years ago

I've installed in this url:

https://github.com/seanharr11/etlalchemy.git

Thanks!

2017-05-16 13:19 GMT+02:00 Alvaro Ramiro Benavides alraben@gmail.com:

I made you said me and it was ok.

But I have a doubt:

pip etlalchemy versión is 1.1

git version 1.0.6

etlalchemy (1.0.6) How is it possible?

Thanks!

2017-05-12 14:18 GMT+02:00 Sean Harrington notifications@github.com:

Hi @alraben https://github.com/alraben - this is most likely due to ETLAlchemy 'minimizing' the size of VARCHAR columns. The basic logic is:

  1. Scan through all rows in the table
  2. For each column, find the MAX sized string in each VARCHAR column
  3. Set the column size (on the target) to be the nearest_power_of_two of that column's max string size

To override this, install the package from github (the PyPi package is out-of-date). There is now a compress_varchar argument of the ETLAlchemySource.init() function, which defaults to False, which is what you want.

Try this and let me know how it works!

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/seanharr11/etlalchemy/issues/19#issuecomment-301061513, or mute the thread https://github.com/notifications/unsubscribe-auth/AUJEQMbx6pf7kdMkJEYRlmSQjXbpJH2Mks5r5E4vgaJpZM4NZKxI .

-- Álvaro Ramiro Benavides alraben(arroba)gmail(dot)com

-- Álvaro Ramiro Benavides alraben(arroba)gmail(dot)com

seanharr11 commented 7 years ago

Hi @alraben - I released yesterday on PyPi, and just updated the release on Github. If you try to install from github again, it will be in-sync with PyPi! Are we OK to close this issue?

alraben commented 7 years ago

Yes.

Thanks

Close the issue

El 16 may. 2017 17:33, "Sean Harrington" notifications@github.com escribió:

Hi @alraben https://github.com/alraben - I released yesterday on PyPi, and just updated the release on Github. If you try to install from github again, it will be in-sync with PyPi! Are we OK to close this issue?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/seanharr11/etlalchemy/issues/19#issuecomment-301821003, or mute the thread https://github.com/notifications/unsubscribe-auth/AUJEQIs5ggmbFYkBsgJieSHtWJh06m7eks5r6cHMgaJpZM4NZKxI .