SAA-SDT / eac-cpf-schema

https://eac.staatsbibliothek-berlin.de/

Website: recommendation for validator #36

Open SJagodzinski opened 7 years ago

SJagodzinski commented 7 years ago

Recommend tools for validation for EAC-CPF

Creator of issue

  1. Silke Jagodzinski
  2. TS-EAS: EAC-CPF subgroup
  3. s.jagodzinski@bundesarchiv.de or silkejagodzinski@gmail.com

The issue relates to

Wanted change/feature

We spent time finding a validator that we could use to test our EAC-CPF XML, as standard validators would not work. It would have been great if a specific validator or set of validators had been recommended so that institutions could accurately validate their EAC-CPF XML data.

[K. Cox]

karinbredenberg commented 7 years ago

This sounds strange; the ordinary validators should work. Tests are needed! I would say it is a schema issue.

SJagodzinski commented 7 years ago

I could go back to K. Cox and ask for her validator. Never had problems with xsd or rng files so far and never heard about that before.

karinbredenberg commented 7 years ago

Do that so we can see what the problem is. Try to get as much info as you can.

Kirstycox commented 6 years ago

Hi Karin

Apologies for the tardy reply. We found issues when attempting to validate our EAC-CPF XML files; as 'newbies' we struggled a bit.

We found that, because EAC-CPF XML uses a schema, our XML files did not validate using https://validator.w3.org/ (which of course is more authoritative and has a well-developed support community). Instead we used https://www.xmlvalidation.com/.

We guessed that this was because the W3C validator uses DTDs and not schemas. Is this correct?

So what we would like to know, and what would also be of some help to others adopting EAC-CPF, is whether the EAC-CPF community recommends an authoritative schema validator. In retrospect we should have asked the EAD listserv community for assistance, but we didn't.

Thanks, Kirsty

fordmadox commented 6 years ago

Hi, @Kirstycox : can you provide a sample EAC file?

The https://validator.w3.org validator will only work for HTML files, I believe, so you cannot validate EAC-CPF files with that tool.

I did test the https://www.xmlvalidation.com/ validation tool with an EAC file, and it appeared to work fine, as long as I checked the "Validate against external XML schema" button. I didn't do much digging into that, though, to determine how accurate the results are with that tool. The About page doesn't mention anything about how it operates, so I honestly don't know with just a quick look.

If you use a free text editor, however, there might be an option to do XML validation in that tool. For example, here's how to perform XSD validations with Notepad++: https://stackoverflow.com/a/30167471

Another free way to validate XML files would be to use a tool like Saxon-HE. https://www.saxonica.com/html/documentation/schema-processing/commandline.html . This would require a bit more effort to get started, since you'd need to download some extra files, but it would provide a host of benefits, such as the ability to perform validations in bulk.

I use oXygen XML Editor almost exclusively for working with XML, which makes validating files in bulk very easy.

It sounds like there would be a benefit to provide other examples of how to validate EAC files, however, so we'll definitely discuss that.
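As one possible starting point for such examples, here is a minimal Python sketch of a bulk first-pass check using only the standard library. It tests well-formedness only, not validity against the EAC-CPF schema; full XSD validation still needs a schema-aware tool such as the ones mentioned above. The function name `check_folder` and the sample file contents are made up for illustration.

```python
import tempfile
from pathlib import Path
from xml.etree import ElementTree as ET


def check_folder(folder):
    """Map each .xml file name in `folder` to None (parsed) or an error message.

    This is a well-formedness check only; it does not consult any schema.
    """
    results = {}
    for path in sorted(Path(folder).glob("*.xml")):
        try:
            ET.parse(path)
            results[path.name] = None
        except ET.ParseError as err:
            # ParseError.position is a (line, column) tuple.
            line, column = err.position
            results[path.name] = f"line {line}, column {column}: {err}"
    return results


if __name__ == "__main__":
    # Hypothetical sample files, written to a temporary folder for the demo.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "good.xml").write_text("<eac-cpf><control/></eac-cpf>", encoding="utf-8")
        Path(tmp, "bad.xml").write_text("<eac-cpf><control></eac-cpf>", encoding="utf-8")
        for name, problem in check_folder(tmp).items():
            print(name, "->", problem or "well-formed")
```

Running a pass like this before schema validation can separate files that are simply broken XML from files that are well-formed but invalid against the schema.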

MicheleCombs commented 6 years ago

FYI, you can validate other kinds of XML files using https://validator.w3.org as long as the file has the right DOCTYPE declaration at the top and the validator can find the dtd to validate against. I’ve used it to validate EAD many many times. Unfortunately it doesn’t appear to work with schemas.
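The difference between DTD-based and schema-based validation shows up in what a file declares at the top. The following is a minimal Python sketch (standard library only) that reports whether a document carries a DOCTYPE declaration, which a DTD-based validator needs, or an `xsi:schemaLocation` hint, which a schema-aware validator can use. The function name `describe_validation_hints`, the sample documents, and the local file name `cpf.xsd` are illustrative assumptions; the namespace `urn:isbn:1-931666-33-4` is what I understand EAC-CPF to use, so check it against the schema itself.

```python
import xml.parsers.expat

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def describe_validation_hints(xml_text):
    """Return a dict describing which validation hints the document carries."""
    hints = {"doctype": None, "schema_location": None, "root_namespace": None}
    seen_root = False

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Prefer the system identifier (usually the DTD file or URL).
        hints["doctype"] = system_id or public_id or name

    def on_start(name, attrs):
        nonlocal seen_root
        if seen_root:
            return
        seen_root = True
        # With namespace_separator=" ", qualified names arrive as "namespace localname".
        if " " in name:
            hints["root_namespace"] = name.split(" ", 1)[0]
        hints["schema_location"] = attrs.get(XSI_NS + " schemaLocation")

    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return hints


if __name__ == "__main__":
    # Hypothetical DTD-style document (the kind a DTD-based validator expects).
    dtd_style = '<!DOCTYPE ead SYSTEM "ead.dtd"><ead/>'
    # Hypothetical schema-style document; "cpf.xsd" is a placeholder file name.
    schema_style = (
        '<eac-cpf xmlns="urn:isbn:1-931666-33-4" '
        'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
        'xsi:schemaLocation="urn:isbn:1-931666-33-4 cpf.xsd"/>'
    )
    print(describe_validation_hints(dtd_style))
    print(describe_validation_hints(schema_style))
```

A file with no DOCTYPE gives a DTD-based validator nothing to validate against, which would be consistent with the behaviour reported in this thread.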

This one, https://www.freeformatter.com/xml-validator-xsd.html, will validate against schemas, though for some reason it seems to have trouble reading the EAC-CPF schema (http://eac.staatsbibliothek-berlin.de/schema/cpf2004v01.xsd); I’m not sure why.

So do all the other validators I’ve tried, in fact, including https://www.xmlvalidation.com/. None of them seem to “like” the schema as posted at that URL.

Ford, how did you get it to work?

Michele

Kirstycox commented 6 years ago

Hi Michele,

I just tried to attach an EAC-CPF XML file, but it is not a supported file type. If you wish, I can email you directly; alternatively, our entire EAC-CPF dataset is available for download at https://natlib.govt.nz/about-us/open-data/turnbull-names-metadata

Cheers, Kirsty

MicheleCombs commented 6 years ago

Hi Kirsty –

Thanks! We actually have access to sufficient EAC files for testing, so that’s not an issue. My question was really about, first, the schema posted at the URL, and second, how Ford got the validator at https://www.xmlvalidation.com/ to work.

Michele

fordmadox commented 6 years ago

@Kirstycox : I wanted to follow up with you to find out if you've found a satisfactory solution for validating EAC files. If not, have you tried using a free text editor, such as https://notepad-plus-plus.org, and the solution suggested here: https://stackoverflow.com/questions/15436183/using-notepad-to-validate-xml-against-an-xsd/30167471#30167471 ?

Kirstycox commented 6 years ago

Hi there

Yes, we use Notepad++, specifically to localise validation errors, as our vendor and technology support team have developed automated validation tools for us.

However, we found the best validator for EAC-CPF XML to be https://www.xmlvalidation.com/. We raised this as a query because we thought some guidance on XML validation for EAC-CPF would be helpful to those new to it; in our experience as new users, we found this a bit difficult.

Cheers, Kirsty

MicheleCombs commented 6 years ago

Kirsty, could you let me know how you managed to validate using that site? I’ve tried a number of times and can’t get it to find/recognize the EAC-CPF schema.

Michele

Kirstycox commented 5 years ago

Hi Michele

Apologies, I have been on extended leave, so I am only just getting through my backlog of emails.

Yes, I guess that is the key issue: finding an XML validation tool that will recognise the EAC-CPF schema.

Apologies, I think we used https://validator.w3.org/ for validation in the past, in 2016/2017.

The vendor (Axiell) of our Collections Management System (EMu) validates EAC-CPF XML as a requirement. We had a volunteer developer help us create our own validation tool to help with this (NOTE: a full export of all our EAC-CPF XML records is bundled into one dataset and automatically goes through this validation tool every month): https://axrdo.gitlab.io/tiaki/

Kind regards,

Kirsty Cox
Research Librarian, Digital Materials (Arrangement & Description) | Kaitiaki Pukapuka Mamati
EMu Administrator | Kaiwhakarite Pātengi Raraunga
Alexander Turnbull Library | National Library of New Zealand Te Puna Mātauranga o Aotearoa
Direct Dial: +64 4 462 3958 | 70 Molesworth Street, Thorndon, Wellington
PO Box 12349, Thorndon, Wellington 6144, New Zealand | http://www.natlib.govt.nz/

