keeps / roda

RODA - Repository of Authentic Digital Objects
https://www.roda-community.org/
GNU Lesser General Public License v3.0

Plugin that assesses the risk of a file not being well characterised #642

Closed jmaferreira closed 7 years ago

jmaferreira commented 8 years ago

The plugin should detect files that don't have an identified format and associate them with a risk. We should then be able to run Siegfried on those files to remove the risk.

rui-castro commented 7 years ago

@luis100 @hsilva-keep done in PR #702

NOTE: the risk information (description and notes) needs to be improved. I don't know how...

jmaferreira commented 7 years ago

We can fix the descriptions. No problem. I'm just wondering whether the URN should include the exact name of the plugin in its ID for better tracking. What do you think, @luis100?

rui-castro commented 7 years ago

plugin class org.roda.core.plugins.riskManagement.FileNotCharacterizedRiskAssessmentPlugin

rui-castro commented 7 years ago

@luis100 @hsilva-keep @jmaferreira done in PR #715

luis100 commented 7 years ago

File is not comprehensively characterized.
Missing format designation: 1
Missing mimetype: 0
Missing PRONOM UID: 0

Tried the plugin and it stated that the format designation was missing, but it wasn't.

report file

rui-castro commented 7 years ago

@luis100 I will clarify the output.

About the missing format designation: the format designation is composed of two attributes (formatDesignationName and formatDesignationVersion). If either one is missing, the formatDesignation is reported as missing too. I can make this more obvious by changing the output message from "Missing format designation" to "Missing format designation (name and/or version)". Let me know...
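To make the rule concrete, here is a minimal sketch of the check, assuming a simple holder for the four values involved. `FileFormatInfo` and `CharacterizationCheck` are hypothetical names made up for illustration, not RODA classes, and the sample values in `main` are invented.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the format metadata of one file (not a RODA class). */
final class FileFormatInfo {
  final String formatDesignationName;
  final String formatDesignationVersion;
  final String mimetype;
  final String pronomUid;

  FileFormatInfo(String name, String version, String mimetype, String pronomUid) {
    this.formatDesignationName = name;
    this.formatDesignationVersion = version;
    this.mimetype = mimetype;
    this.pronomUid = pronomUid;
  }
}

/** Hypothetical helper that mirrors the report lines discussed in this thread. */
final class CharacterizationCheck {

  static boolean isBlank(String s) {
    return s == null || s.trim().isEmpty();
  }

  /** The designation counts as missing when the name and/or the version is missing. */
  static boolean isFormatDesignationMissing(FileFormatInfo f) {
    return isBlank(f.formatDesignationName) || isBlank(f.formatDesignationVersion);
  }

  /** Builds the per-file report lines, using the clearer wording proposed above. */
  static List<String> describe(FileFormatInfo f) {
    List<String> lines = new ArrayList<>();
    lines.add("Missing format designation (name and/or version): "
        + (isFormatDesignationMissing(f) ? 1 : 0));
    lines.add("Missing mimetype: " + (isBlank(f.mimetype) ? 1 : 0));
    lines.add("Missing PRONOM UID: " + (isBlank(f.pronomUid) ? 1 : 0));
    return lines;
  }

  public static void main(String[] args) {
    // Name present, version absent: the designation is still reported as missing.
    FileFormatInfo f =
        new FileFormatInfo("Portable Document Format", null, "application/pdf", "fmt/18");
    for (String line : describe(f)) {
      System.out.println(line);
    }
  }
}
```

With a name but no version, the first line reports 1 while the other two report 0, which is the situation described in the report quoted earlier.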

rui-castro commented 7 years ago

@luis100 new commit with improved output.

[Image: screenshot from 2016-11-30 12 13 00]

Still waiting for an answer about the format designation text.

hsilva-keep commented 7 years ago

@rui-castro It would be great if, instead of writing lines of text into the report details, you created a ValidationReport. It can then be added to the details as HTML, which will make things look nicer. Take https://github.com/keeps/roda/blob/hsilva_dev/roda-core/roda-core/src/main/java/org/roda/core/plugins/plugins/base/AIPCorruptionRiskAssessmentPlugin.java#L197 and https://github.com/keeps/roda/blob/hsilva_dev/roda-core/roda-core/src/main/java/org/roda/core/plugins/plugins/base/AIPCorruptionRiskAssessmentPlugin.java#L206 as examples.

rui-castro commented 7 years ago

@hsilva-keep it's a good idea, but for a future issue.

luis100 commented 7 years ago
rui-castro commented 7 years ago

@luis100 new commit in PR #715 with the new parameter

luis100 commented 7 years ago

Merged into lf_dev at 127f305e16bb6055b6eec53cb1c9d387d265883d, but ignoring directories is still missing.

rui-castro commented 7 years ago

@luis100 done in PR #736

luis100 commented 7 years ago

Counters are not being calculated properly. A folder must be counted as a success.

[Image: formatnotcharacterizedcounters]
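A rough sketch of the expected counting behaviour, assuming the job keeps one success counter and one failure counter. `Entry` and `CounterSketch` are hypothetical names for illustration only. How RODA actually tracks these counters is not shown in this thread.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for one processed item (not a RODA class). */
final class Entry {
  final String path;
  final boolean directory;
  final boolean characterized;

  Entry(String path, boolean directory, boolean characterized) {
    this.path = path;
    this.directory = directory;
    this.characterized = characterized;
  }
}

/** Hypothetical tally of successes and failures for one plugin run. */
final class CounterSketch {
  int success;
  int failure;

  /** A directory has no format to identify, so it is counted as a success. */
  void record(Entry e) {
    if (e.directory || e.characterized) {
      success++;
    } else {
      failure++;
    }
  }

  static CounterSketch tally(List<Entry> entries) {
    CounterSketch counters = new CounterSketch();
    for (Entry e : entries) {
      counters.record(e);
    }
    return counters;
  }

  public static void main(String[] args) {
    // Invented sample data: one folder, one characterized file, one uncharacterized file.
    List<Entry> entries = Arrays.asList(
        new Entry("data", true, false),
        new Entry("data/a.pdf", false, true),
        new Entry("data/b.bin", false, false));
    CounterSketch counters = tally(entries);
    System.out.println("success=" + counters.success + " failure=" + counters.failure);
  }
}
```

The point of the sketch is that the total of both counters equals the number of items processed, and a folder never lands in the failure count.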

rui-castro commented 7 years ago

Ok. I will fix it.

rui-castro commented 7 years ago

@luis100 done in PR #736

rui-castro commented 7 years ago

@luis100 ping

luis100 commented 7 years ago

@rui-castro pong

Checked and merged into lf_dev at c6fac937d3cde741640827e58b4da35b68f874c3

rui-castro commented 7 years ago

@luis100 :+1: