shuijian-xu / hive


PROBLEMS WHEN USING RELATIONAL DATABASES #69

Open shuijian-xu opened 4 years ago

shuijian-xu commented 4 years ago

Problems Involving Time

Time variance is one of the most important characteristics of data warehouses. There appears to be a certain amount of data redundancy in the warehouse because we duplicate some information, such as customers' details, that already exists in the operational systems. We have to do this because of the need to record information over time.

As an example, when a customer changes address, we would expect that change to be recorded in the operational database. When it is, the old address is overwritten and lost. The next time a query includes that customer's details, all sales of wine to that customer are automatically attributed to the new address. If we are investigating sales by area, the results will be incorrect (assuming the customer moved to a different area), because many of the sales were made while the customer lived in another area.
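The effect of an address change can be illustrated with a minimal Python sketch. All of the data, the area names, and the function names below are hypothetical and chosen only for illustration. The validity-period layout is one possible way of keeping address history, not necessarily the design the warehouse would use.

```python
from collections import defaultdict
from datetime import date

# Hypothetical wine sales: (customer_id, sale_date, amount)
sales = [
    (1, date(2020, 3, 1), 100.0),
    (1, date(2020, 6, 1), 150.0),
    (1, date(2021, 2, 1), 80.0),
]

# Operational-style view: only the current area survives an address change.
# Assumption: customer 1 moved from "North" to "South" on 2021-01-01.
current_area = {1: "South"}

# Warehouse-style view: (customer_id, area, valid_from, valid_to),
# where valid_to is exclusive.
area_history = [
    (1, "North", date(2019, 1, 1), date(2021, 1, 1)),
    (1, "South", date(2021, 1, 1), date.max),
]


def sales_by_current_area(sales, current_area):
    """Attribute every sale to the customer's current area."""
    totals = defaultdict(float)
    for customer_id, _sale_date, amount in sales:
        totals[current_area[customer_id]] += amount
    return dict(totals)


def sales_by_area_at_sale_time(sales, area_history):
    """Attribute each sale to the area that was valid on the sale date."""
    totals = defaultdict(float)
    for customer_id, sale_date, amount in sales:
        for hist_id, area, valid_from, valid_to in area_history:
            if hist_id == customer_id and valid_from <= sale_date < valid_to:
                totals[area] += amount
                break
    return dict(totals)


print(sales_by_current_area(sales, current_area))
print(sales_by_area_at_sale_time(sales, area_history))
```

With only the current address, all three sales are counted under the new area. With the history retained, the two earlier sales stay with the area the customer lived in at the time.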

That's also the reason why we don't delete customers' details from the data warehouse simply because they are no longer customers. If they have placed any orders at all, then they have to remain within the system.
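A small sketch, again using hypothetical data and names, shows what could happen to a sales-by-area report if a former customer's details were deleted. The orders remain, but they can no longer be attributed to any area.

```python
# Hypothetical customer details: customer_id -> area
customers = {1: "North", 2: "South"}

# Hypothetical orders: (customer_id, amount)
orders = [(1, 120.0), (2, 200.0), (1, 30.0)]


def sales_by_area(orders, customers):
    """Total order amounts per area, skipping orders with no customer record."""
    totals = {}
    for customer_id, amount in orders:
        area = customers.get(customer_id)
        if area is None:
            continue  # the order cannot be attributed to an area
        totals[area] = totals.get(area, 0.0) + amount
    return totals


before = sales_by_area(orders, customers)

# Assumption: customer 2 stops buying and their details are deleted.
remaining = {cid: area for cid, area in customers.items() if cid != 2}
after = sales_by_area(orders, remaining)

print(before)
print(after)
```

After the deletion, the sales for the "South" area disappear from the report even though the orders were placed.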