VijayQin / DMHY-spider

This project crawls DMHY and stores the HTML and torrent file of each animation in the local file system and in a database (SQLite3). Future work includes filtering the animations we want by given rules and alerting us to those updated each day.

Consistency between DMHY.db and Warehouse/ #2

Open fno2010 opened 8 years ago

fno2010 commented 8 years ago

When an update task is interrupted accidentally, the torrent files are still downloaded and stored in Warehouse/, but the corresponding items are not committed to DMHY.db.

VijayQin commented 8 years ago

I think you must have noticed that there is no exception handling in the code so far. Once the program encounters any exception, nothing is committed to the database. That is because I want to guarantee the integrity of the data. In other words, if one item from a given day has been inserted into the database, the database should contain every item from that day. That is also why I do not recommend mode 4 (auto update) to users; mode 1 is recommended instead.
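To illustrate the all-or-nothing rule, here is a minimal sketch using Python's `sqlite3` module. It is not the project's actual code: the `items` table, its columns, and the `commit_day` helper are hypothetical names chosen for the example. It relies on the connection acting as a context manager, which commits if the block succeeds and rolls back if an exception is raised.

```python
import sqlite3


def commit_day(conn, items):
    """Insert all of one day's items, or none of them."""
    # Hypothetical table and columns; the real DMHY.db schema may differ.
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT INTO items (title, torrent_url) VALUES (?, ?)", items
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (title TEXT NOT NULL, torrent_url TEXT NOT NULL)"
)

try:
    # The second row violates NOT NULL, so the first row is rolled back too.
    commit_day(conn, [("A", "a.torrent"), ("B", None)])
except sqlite3.IntegrityError:
    pass

count_after_failure = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

After the failed call the table should still be empty, which matches the behaviour described above: a day is either fully present in the database or absent.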

However, the warehouse is neither bound by that integrity requirement nor used for data mining. It only lets users download animations or view the update list without going through the database. If the database breaks down accidentally, users can still view the list or download animations.

Therefore, neither integrity nor consistency is required of the warehouse; all it needs is availability.

Anyway, I think I should take exception handling into consideration to give us more debugging detail, although I will not commit an incomplete task to the database.
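One possible shape for that, as a rough sketch only: catch the exception, roll back, and log the full traceback with the standard `logging` module. The logger name, the `run_update` function, the `fetch_items` callable, and the single-column `items` table are all assumptions made for the example, not names from the project.

```python
import logging
import sqlite3

logger = logging.getLogger("dmhy_spider")  # hypothetical logger name


def run_update(conn, fetch_items):
    """Run one update; on failure, log the traceback and commit nothing."""
    try:
        titles = fetch_items()  # hypothetical callable; may raise, e.g. on network errors
        conn.executemany(
            "INSERT INTO items (title) VALUES (?)", [(t,) for t in titles]
        )
        conn.commit()
        return True
    except Exception:
        conn.rollback()  # discard any rows inserted before the failure
        logger.exception("update task failed; nothing committed")
        return False
```

`logger.exception` records the message together with the traceback of the exception being handled, so the debugging detail is kept while the database stays untouched.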