NCEAS / metacat

Data repository software that helps researchers preserve, share, and discover data
https://knb.ecoinformatics.org/software/metacat
GNU General Public License v2.0

dtd filenames clash if reused for multiple PUBLIC identifiers #272

Closed mbjones closed 5 years ago

mbjones commented 6 years ago

Author Name: Matt Jones (Matt Jones)
Original Redmine Issue: 1452, https://projects.ecoinformatics.org/ecoinfo/issues/1452
Original Date: 2004-04-05
Original Assignee: Matt Jones


Problem reported by Rod Spears:

Ok, this is partially intended behavior. Metacat takes the following approach to establishing the relationship between a PUBLIC identifier/namespace and an associated DTD or schema:

1) When a document is submitted, check its PUBLIC id/namespace
   a) If it is not registered, then try to retrieve the DTD from either the passed in parameters, or from the provided SYSTEM identifier or from an xsi:schemaLocation. If schema is obtained, cache it and record its location and the public identifier. Fail with error if schema can't be obtained.
   b) If we already have it registered, look up the cached version of the schema and use it for validation, ignoring any data the user passes in.
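The lookup rules above can be sketched roughly as follows. This is only an illustration of the described behavior, not Metacat's actual code: `SchemaRegistrySketch`, `resolve`, and the in-memory map are hypothetical names, and the schema is reduced to a plain string, assuming the caller has already tried the passed-in parameters, the SYSTEM identifier, and xsi:schemaLocation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory stand-in for the registration logic described above;
// none of these names come from Metacat itself.
class SchemaRegistrySketch {
    // PUBLIC identifier (or namespace) -> cached DTD/schema content
    private final Map<String, String> cachedByPublicId = new HashMap<>();

    // suppliedSchema is whatever could be retrieved for this submission, if anything.
    String resolve(String publicId, Optional<String> suppliedSchema) {
        String cached = cachedByPublicId.get(publicId);
        if (cached != null) {
            // Case b: already registered, so the cached copy wins and
            // anything the user passed in is ignored.
            return cached;
        }
        // Case a: not registered. Fail if no schema could be obtained.
        String schema = suppliedSchema.orElseThrow(
            () -> new IllegalStateException("No DTD/schema obtainable for " + publicId));
        cachedByPublicId.put(publicId, schema);
        return schema;
    }
}
```

Under these assumptions, calling `resolve` twice with the same PUBLIC identifier but different schema content returns the first content both times, which is the "first submission wins" behavior.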

This means that the first submitted document with a given type determines the DTD/schema used for validation for all subsequent documents submitted as that type. This allows an administrator to pre-register several document types that are important to him and be sure that any submitted documents are valid with respect to the schema he provided. Metacat ships with several pre-registered schemas and DTDs for EML.

So, your issue is this: the first time you registered the DTD, it uploaded the ecogridregistry.205.22.dtd file to Metacat's DTD cache. Later, when you tried to upload a new document using a different public ID but the same system ID, it tried to save the file ecogridregistry.205.22.dtd but found that it already existed in the DTD cache, so it couldn't. This is a bug. There's no reason we should use the exact filename passed in to us as the cached DTD filename, so we should gracefully rename the DTD file when a name is already in use. This hasn't cropped up before because we haven't had people using the same DTD for different PUBLIC identifiers. You can work around it by simply renaming your DTD (to anything other than its current name) and then resubmitting. I'll file this as yet another bug -- yikes.
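One way the graceful renaming could look is sketched below. It is an assumption about a possible fix, not the change that was actually made in Metacat: `DtdCacheNamingSketch`, `chooseFreePath`, and the `-N` suffix scheme are all hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: picks a filename in the DTD cache that is not yet taken.
class DtdCacheNamingSketch {
    static Path chooseFreePath(Path cacheDir, String requestedName) {
        Path candidate = cacheDir.resolve(requestedName);
        if (!Files.exists(candidate)) {
            return candidate; // no clash, keep the submitted name
        }
        // Split at the last dot so the extension (e.g. ".dtd") is preserved.
        int dot = requestedName.lastIndexOf('.');
        String stem = dot > 0 ? requestedName.substring(0, dot) : requestedName;
        String ext = dot > 0 ? requestedName.substring(dot) : "";
        // Append -1, -2, ... until an unused name is found.
        for (int n = 1; ; n++) {
            candidate = cacheDir.resolve(stem + "-" + n + ext);
            if (!Files.exists(candidate)) {
                return candidate;
            }
        }
    }
}
```

With this scheme, a second upload of ecogridregistry.205.22.dtd under a different PUBLIC identifier would be stored as ecogridregistry.205.22-1.dtd, and the recorded location for that identifier would point at the renamed file. The existence check and the later file creation are not atomic, so a real implementation would also need to guard against concurrent uploads.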

mbjones commented 6 years ago

Original Redmine Comment Author Name: Redmine Admin (Redmine Admin)
Original Date: 2013-03-27T21:17:14Z


Original Bugzilla ID was 1452