MarcusBarnes / mik

The Move to Islandora Kit is an extensible PHP command-line tool for converting source content and metadata into packages suitable for importing into Islandora (or other digital repository and preservations systems).
GNU General Public License v3.0
34 stars 11 forks source link

checkconfig failure on null mappings #442

Closed bondjimbond closed 6 years ago

bondjimbond commented 6 years ago

After a git pull, I tried running MIK on a new ini file that included some null lines in the mapping file, which map to templates (which is normal practice as far as I know). When I run MIK checkconfig, I get the following error:

Error: Mapping snippet null0 appears to be not well formed

I tried running on older ini files that I know work, and got the same error. Any ideas what's happening?

mjordan commented 6 years ago

The validation happening at https://github.com/MarcusBarnes/mik/blob/master/src/config/Config.php#L135 is not taking into account that the right side of the null mapping (i.e, the second column) is not XML. I think all we need to do is add logic so that if the value begins with 'null' we should be ok.

mjordan commented 6 years ago

Looking at this a bit more closely, I see the specific problem. The current check will pass null mappings like this:

null0,<accessCondition>This resource is in the Public Domain.</accessCondition>
null1,<note type="additional physical form">Also available on microfiche.</note>
null3,<identifier type="uuid"/>

but will fail for the reason I mentioned in the case of the InsertXmlFromTemplate metadata manipulator (the one you hit this on), like this:

null4,null4

So to address this we'll need to add an additional context check for that manipulator.

bondjimbond commented 6 years ago

Thanks for looking into it, @mjordan. Was this check added/changed recently? I don't recall checkconfig failing on earlier mapping files with null values.

If you can advise on how I might go about adding such a check, I'd happily try my hand at it; or if you've got a branch for this already I'd be more than willing to test.

mjordan commented 6 years ago

@bondjimbond we added the InsertXmlFromTemplate metadata manipulator in Nov. 2016, so null0,null0-style mappings have probably been failing since then.

I'm not sure what the best approach is here. This type of mapping is specific to that manipulator, and, if used incorrectly where a null0,<xml></xml>-style mapping is the correct type to use (and vice versa), would pass validation. In fact, a user could have both types in the same mappings file at the same time. Not sure how to handle that.

Maybe manipulators should be doing their own mapping validation? In this case, if the InsertXmlFromTemplate metadata manipulator is registered, it would check the mapping file so see if its preferred mapping is present; if it is, the validation passes, if not, it fails. Just an idea.

mjordan commented 6 years ago

Maybe manipulators should be doing their own mapping validation?

Some lunchtime poking around... In this case, the InsertXmlFromTemplate metadata manipulator knows what row in the mapping file its mapping is (the mapping key is passed in as a parameter). If the manipulator implemented a method such as ->validateNullMapping(), it could pass back a TRUE of FALSE. In this case, the validation would simply be the the two parts of the mapping were the same. If it passes back a FALSE, we note that in the output to the user; if it passes back a TRUE, we remove that mapping's row from the existing check so it doesn't fail.

Before implementing something like this, I'd want to make sure that the pattern I'm describing would be applicable to other metadata manipulators. @MarcusBarnes any thoughts on this?

MarcusBarnes commented 6 years ago

@mjordan I'm happy with your suggested approach .

bondjimbond commented 6 years ago

As a simple approach, would it be possible to document the failure of checkconfig when null mappings are present, and provide an option (e.g. --ignore_null_mappings) that will skip them so we can still see any other config failures?

The need there is that the checkconfig fails when it reads the first null line, so you don't see any config problems further on in the process.

mjordan commented 6 years ago

@bondjimbond, documenting the predictable failure and at most adding a --ignore_null_mappings option is the way to go. A lot less complicated for what is a fairly specific edge case. @MarcusBarnes any thoughts?

bondjimbond commented 6 years ago

Cool. I don't know how that would work, but if someone writes it I can test.

mjordan commented 6 years ago

I'll open a PR adding the new option. If it works as intended you can merge as well 😄

mjordan commented 6 years ago

Question: should the new --ignore_null_mappings option ignore only null0,null0-style mappings only, or also null0,<accessCondition>This resource is in the Public Domain.</accessCondition>-style mappings? We may not want to ignore the latter since it's really easy to mess up the XML.

bondjimbond commented 6 years ago

@mjordan Wish I'd seen this before merging your PR... I'm not sure which approach you took. Yes, null mappings with XML should still be validated. Which way does yours work?

mjordan commented 6 years ago

It only ignores the null0,null0-style, not the other style. So we're good.

bondjimbond commented 6 years ago

Before this error was introduced, did checkconfig evaluate the validity of the manipulators? If so, it doesn't now; I tested by adding some bad code and bad XML in a Twig template and it wasn't caught.

mjordan commented 6 years ago

@bondjimbond no, the validity of the manipulators was never checked.