jajuk-team / jajuk

Advanced jukebox for users with large or scattered music collections

Safeguards against data corruption #1672

Open bflorat opened 9 years ago

bflorat commented 9 years ago

Reported by mats ahlgren on 2 Jun 2010 17:25 UTC

In light of #1630, where collection data was silently lost, some of the following methods might help prevent such silent, critical data loss in the future.

bflorat commented 9 years ago

Commented by bflorat on 9 Jun 2010 20:33 UTC

On the first two points: you may be right, but that is simply not how we designed jajuk, and the change would be so deep that I can't imagine making it. For example, the jajuk data model is currently designed so that a track (with title, album...) maps to one or more audio files (which hold only physical things like the file URL). So in the physical view, a file's title comes from its track's data. If you use audio hashing, how do you deal with tracks that have identical audio content but different tags?

On the third: why not, if it's an option disabled by default, but I don't think it's worth the code. Indeed, removing files from a collection is a very common operation, and we don't want to harass users with this kind of check.

On the fourth, ditto: why not, but it should be disabled by default. The strange thing about it is its meta-feature side: it is jajuk checking itself, but programs usually start from the premise that they work ;-)

In any case, as you figured, jajuk builds backup files, so if you really lose a lot of data (which is very unlikely: the #1630 bug required many conditions and was unlikely to happen), you'll notice quickly and can restore a backup.

bflorat commented 9 years ago

Commented by mats.ahlgren@gmail.com on 12 Jun 2010 20:10 UTC

> So in the physical view, a file's title comes from its track's data. If you use audio hashing, how do you deal with tracks that have identical audio content but different tags?

If there is a one-to-one relationship between tracks and files, shouldn't the file path be an attribute of Track? And shouldn't the hash just be the file path? This would ensure the hash doesn't change when metadata changes; if the hash is the unique identifier, then it shouldn't depend on anything besides the file path, right?
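To make the suggestion concrete, here is a minimal Python sketch of an identifier that depends only on the file path. It is an illustration, not Jajuk code: `AudioFile`, `path_id`, and the choice of SHA-1 are assumptions made for the example, and Jajuk's real identifier scheme is not described in this thread.

```python
import hashlib
from dataclasses import dataclass, field


def path_id(path: str) -> str:
    """Return an identifier derived from the file path only (hypothetical helper)."""
    return hashlib.sha1(path.encode("utf-8")).hexdigest()


@dataclass
class AudioFile:
    # Hypothetical stand-in for a collection entry, not a Jajuk class.
    path: str
    tags: dict = field(default_factory=dict)

    @property
    def id(self) -> str:
        # Tags are deliberately ignored, so retagging never changes the id.
        return path_id(self.path)


f = AudioFile("/music/album/01.ogg", {"title": "Intro"})
before = f.id
f.tags["title"] = "Intro (remastered)"
assert f.id == before  # a metadata edit leaves the identifier untouched
```

The trade-off is that moving or renaming the file changes its identifier, so a scheme like this only stays stable as long as paths do.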

> On the third: why not, if it's an option disabled by default, but I don't think it's worth the code. Indeed, removing files from a collection is a very common operation, and we don't want to harass users with this kind of check.

Such a feature would unfortunately be useless if disabled by default. The point would be to have a safety net. It is a technique called "representation-invariant checking", and if done properly it should not have any performance impact. Of course, this would not harass users when they click "Remove File". It should never trigger unless the user moves files around in the file system (which Jajuk does not seem to support, or warn the user about?), or unless data is lost due to corruption.
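As a rough illustration of representation-invariant checking, the Python sketch below validates a toy collection after it is loaded. The `Collection` class, its two dictionaries, and the two invariants it checks are assumptions made for the example; they are not taken from Jajuk's actual data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    # Hypothetical layout: track id -> tag data, file path -> owning track id.
    tracks: dict = field(default_factory=dict)
    files: dict = field(default_factory=dict)

    def check_rep(self) -> list:
        """Return a list of invariant violations (an empty list means consistent)."""
        problems = []
        # Assumed invariant 1: every file points at a track that exists.
        for path, track_id in self.files.items():
            if track_id not in self.tracks:
                problems.append(f"file {path} references unknown track {track_id}")
        # Assumed invariant 2: every track is backed by at least one file.
        referenced = set(self.files.values())
        for track_id in self.tracks:
            if track_id not in referenced:
                problems.append(f"track {track_id} has no files")
        return problems


c = Collection(tracks={"t1": {"title": "A"}}, files={"/m/a.ogg": "t1"})
assert c.check_rep() == []
```

A check like this runs in time linear in the size of the collection, and it never involves the user unless it finds a violation.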

> On the fourth, ditto: why not, but it should be disabled by default. The strange thing about it is its meta-feature side: it is jajuk checking itself, but programs usually start from the premise that they work ;-)

Heheh, true. But such a feature would also be useless if disabled by default. The point would be to have a safety net.

I'll mention any other possible ideas as I think of them, but these were the main ones.

Sidenote: these steps would also help prevent other corruption issues such as #1395, not just schema-update issues like #1630.

bflorat commented 9 years ago

Commented by bflorat on 16 Jun 2010 21:17 UTC Replying to mats.ahlgren@:

>> So in the physical view, a file's title comes from its track's data. If you use audio hashing, how do you deal with tracks that have identical audio content but different tags?

> If there is a one-to-one relationship between tracks and files, shouldn't the file path be an attribute of Track? And shouldn't the hash just be the file path? This would ensure the hash doesn't change when metadata changes; if the hash is the unique identifier, then it shouldn't depend on anything besides the file path, right?

It is a one-to-many cardinality between tracks and files (a track is the "logical" tag view of one or more audio files).
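The one-to-many relationship described here can be sketched as follows. This is a simplified Python illustration, not Jajuk's Java code: the `Track` and `File` classes, their fields, and the `attach` helper are assumptions based only on the description in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    # The "logical" tag view: one track, one or more files.
    title: str
    album: str
    files: list = field(default_factory=list, repr=False, compare=False)


@dataclass
class File:
    # The "physical" side: only things like the file URL live here.
    url: str
    track: Track

    @property
    def title(self) -> str:
        # A file carries no tags of its own; its title comes from its track.
        return self.track.title


def attach(track: Track, url: str) -> File:
    """Create a file for the given URL and register it on its track."""
    f = File(url, track)
    track.files.append(f)
    return f


song = Track("Song", "Album")
mp3 = attach(song, "file:///music/song.mp3")
flac = attach(song, "file:///music/song.flac")
assert mp3.title == flac.title == "Song"
```

Under this model, two files with identical audio but different tags would belong to two different tracks, which is the difficulty raised about audio hashing.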

>> On the third: why not, if it's an option disabled by default, but I don't think it's worth the code. Indeed, removing files from a collection is a very common operation, and we don't want to harass users with this kind of check.

> Such a feature would unfortunately be useless if disabled by default. The point would be to have a safety net. It is a technique called "representation-invariant checking", and if done properly it should not have any performance impact. Of course, this would not harass users when they click "Remove File". It should never trigger unless the user moves files around in the file system (which Jajuk does not seem to support, or warn the user about?), or unless data is lost due to corruption.

OK, you're right, jajuk would only trigger an alert on changes performed outside a jajuk session.
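One possible shape for such an alert, sketched in Python under stated assumptions: at startup, compare the paths recorded in the collection with what is on disk, and only alert when a large share has vanished. The function names and the 20% threshold are hypothetical and not part of Jajuk.

```python
import os
from typing import Callable, Iterable, List


def missing_files(known_paths: Iterable[str],
                  exists: Callable[[str], bool] = os.path.exists) -> List[str]:
    """Return the known paths that are no longer present on disk."""
    return [p for p in known_paths if not exists(p)]


def should_alert(known_paths: List[str], missing: List[str],
                 threshold: float = 0.2) -> bool:
    """Alert only when at least `threshold` of the collection vanished at once."""
    if not known_paths:
        return False
    return len(missing) / len(known_paths) >= threshold
```

Because removals made inside a session update the recorded paths as they happen, a startup check like this would only see files that disappeared between sessions.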

>> On the fourth, ditto: why not, but it should be disabled by default. The strange thing about it is its meta-feature side: it is jajuk checking itself, but programs usually start from the premise that they work ;-)

> Heheh, true. But such a feature would also be useless if disabled by default. The point would be to have a safety net.

Indeed, it could be a good idea, but it needs deeper study to find cases that actually warrant an alert.