Middlecon / DBImport

DBImport ingestion tool. Handles import, export and standard ETL flows in Hadoop/Hive
Apache License 2.0

sqoop mappers not based on history #36

Closed by BerryOsterlund 5 years ago

BerryOsterlund commented 5 years ago

The number of sqoop mappers is calculated by taking the size of the previous sqoop import and dividing it by the hdfs_blocksize. But if the previous import failed and the last sqoop size is reported as 0 bytes, the next import will be executed with only one mapper. This will most likely cause the next import to fail as well if the source table is large.

The number of mappers needs to be calculated based on the last X previous imports, not just the most recent one.
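A minimal sketch of the failure mode and of a history-based calculation, assuming a simple "largest size among the last N imports" strategy. This is not DBImport's actual code; the function names, the block size value and the strategy itself are illustrative assumptions.

```python
import math


def mappers_from_last_import(last_size_bytes: int, hdfs_blocksize: int) -> int:
    """Naive calculation: previous import size divided by the HDFS block size.
    A failed import that reported 0 bytes collapses the mapper count to 1."""
    return max(1, math.ceil(last_size_bytes / hdfs_blocksize))


def mappers_from_history(previous_sizes: list[int], hdfs_blocksize: int) -> int:
    """History-based variant (assumption): use the largest non-zero size seen
    in the last N imports, so one failed run does not reduce the mapper count."""
    usable = [s for s in previous_sizes if s > 0]
    if not usable:
        return 1
    return max(1, math.ceil(max(usable) / hdfs_blocksize))


if __name__ == "__main__":
    blocksize = 128 * 1024 * 1024  # assumed 128 MB HDFS block size
    history = [0, 25_000_000_000, 24_000_000_000]  # last run failed, reported 0 bytes
    print(mappers_from_last_import(history[0], blocksize))  # 1 mapper -> likely fails again
    print(mappers_from_history(history, blocksize))         # ~187 mappers
```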

BerryOsterlund commented 5 years ago

The auto calculation of mappers is now based on the 'size' column in the 'import_statistics_last' table.
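As a rough illustration of that change, the size used for the calculation could be read from 'import_statistics_last' along these lines. Only the table name and the 'size' column come from the comment above; the connection handling, key columns (hive_db, hive_table) and query are assumptions.

```python
import math
import sqlite3  # stand-in for DBImport's actual configuration database connection


def auto_mappers(conn, hive_db: str, hive_table: str, hdfs_blocksize: int) -> int:
    """Hypothetical lookup: derive the mapper count from the 'size' column of
    'import_statistics_last' for the given table, falling back to 1 mapper."""
    row = conn.execute(
        "SELECT size FROM import_statistics_last "
        "WHERE hive_db = ? AND hive_table = ?",
        (hive_db, hive_table),
    ).fetchone()
    size = row[0] if row and row[0] else 0
    return max(1, math.ceil(size / hdfs_blocksize))
```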