shuijian-xu / hive

staging area #44

Open shuijian-xu opened 5 years ago

shuijian-xu commented 5 years ago

In terms of the actual processing involved, it is quite common for the extraction process to write an intermediate copy of the extracted data to what is known as a “staging area.” The benefit is that, if a problem occurs during the data warehouse load, we can rerun the load from the staged raw data without having to re-extract it. It also makes the raw data easy to back up and archive, should we wish to do so. The extraction part of the tool must, therefore, be able to write out intermediate flat files.
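
A minimal sketch of the idea, assuming CSV as the flat-file format and a local directory as the staging area. The names `extract_to_staging`, `load_from_staging`, and `load_row` are hypothetical and only illustrate the split between extraction and loading. They are not part of any existing tool.

```python
import csv
from pathlib import Path


def extract_to_staging(rows, staging_dir, batch_name):
    """Write extracted rows to a flat file in the staging area.

    `rows` is any iterable of sequences; `batch_name` is a hypothetical
    label for this extraction run. Returns the path of the staged file.
    """
    staging_dir = Path(staging_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    path = staging_dir / f"{batch_name}.csv"
    with path.open("w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    return path


def load_from_staging(path, load_row):
    """Replay a staged flat file into the warehouse loader.

    `load_row` is a caller-supplied callable (an assumption here) that
    loads one row. Because the function only reads the staged file, it
    can be rerun after a failed load without touching the source system.
    Returns the number of rows passed to `load_row`.
    """
    count = 0
    with Path(path).open(newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            load_row(row)
            count += 1
    return count
```

If a load fails partway through, calling `load_from_staging` again on the same file replays the same rows. The staged file can also be copied elsewhere for backup or archiving. Note that CSV stores every value as text, so a real loader would need to convert types itself.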