geonetwork / core-geonetwork

GeoNetwork is a catalog application to manage spatially referenced resources. It provides powerful metadata editing and search functions as well as an interactive web map viewer. It is currently used in numerous Spatial Data Infrastructure initiatives across the world.
http://geonetwork-opensource.org/
GNU General Public License v2.0

mef export should not use local url to thumbnail #2099

Open pvgenuchten opened 6 years ago

pvgenuchten commented 6 years ago

When metadata is exported as MEF/ZIP and thumbnails (and resources) are embedded in the MEF, the metadata should not point to the thumbnail at the location it was exported from. Instead it should reference just the filename, and on import the metadata should be updated to point to the location on the importing node.

The node from which the metadata is exported may not be available online (it may be on localhost or on an intranet).

pvgenuchten commented 6 years ago

This issue exists for the example metadata in the default installer: all metadata there references localhost:8080, which is not available when GeoNetwork runs on any other URL or port. This happens, for example, on the OSGeo Live DVD, which runs GeoNetwork on port 8880.

PascalLike commented 6 years ago

I'm not sure what the best approach would be here.

How it works now: Export keeps the source image URL. This is mandatory for external thumbnails, otherwise you would lose them. Import looks for thumbnails without a URL and assigns them to the portal where the import is happening. It leaves untouched the ones that already contain a URL, treating them as external thumbnails.

The mix of these two behaviours causes the issue. Neither one is wrong on its own.
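To make the interaction concrete, here is a minimal sketch of the import rule as described above. It is not GeoNetwork's actual code: the function name, the example paths, and the `node_resource_base` layout are assumptions made only for illustration.

```python
from urllib.parse import urlparse


def import_thumbnail_current(ref: str, node_resource_base: str) -> str:
    """Mimic the described import rule (hypothetical helper, not GeoNetwork code)."""
    # A reference that already carries an http(s) URL is treated as external
    # and left untouched.
    if urlparse(ref).scheme in ("http", "https"):
        return ref
    # A bare filename is assigned to the importing node.
    return node_resource_base.rstrip("/") + "/" + ref


# Hypothetical values: the export kept the source URL, so the import leaves it alone.
exported = "http://localhost:8080/geonetwork/thumb.png"
print(import_thumbnail_current(exported, "https://catalog.example.org/resources"))
print(import_thumbnail_current("thumb.png", "https://catalog.example.org/resources"))
```

Under these assumptions the first reference survives the import unchanged, which is how a localhost URL ends up in the importing catalog.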

My first suggestion, which fixes the bug and also previous exports, is an extreme approach on import: sanitize every URL during the import. The problem is the external ones. The only solution I see is to check whether the file is in the zip: if it exists, fix the URL, otherwise keep it. The only issue could be when the MEF refers to external thumbnails that have a homonym in the import file.

My second suggestion to fix the bug is a rigid approach during the export: remove any reference to the origin in the thumbnails. This approach works with the current import behaviour. The main drawback is backward compatibility, starting with the demo data of GN.
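A rough sketch of this import-side check, assuming the filename is the last path segment of the URL and that the importing node serves files under a single base URL. The function name, its parameters, and the URL layout are all hypothetical; in practice the list of archive files could come from something like `zipfile.ZipFile.namelist()`.

```python
import posixpath
from urllib.parse import unquote, urlparse


def sanitize_on_import(refs, archive_files, node_resource_base):
    """Rewrite references whose filename is present in the archive (hypothetical helper)."""
    base = node_resource_base.rstrip("/")
    names = set(archive_files)
    fixed = []
    for ref in refs:
        # Assumption: the filename is the last path segment of the reference.
        name = posixpath.basename(unquote(urlparse(ref).path))
        if name and name in names:
            # The file travels inside the MEF, so point at the importing node.
            fixed.append(f"{base}/{name}")
        else:
            # Not in the archive: treat as external and keep it.
            fixed.append(ref)
    return fixed


refs = [
    "http://localhost:8080/geonetwork/thumb.png",  # embedded in the MEF
    "https://other.example.com/img/logo.png",      # external, not in the MEF
]
print(sanitize_on_import(refs, ["thumb.png"], "https://catalog.example.org/resources"))
```

The homonym case shows up directly here: an external URL ending in `thumb.png` would also be rewritten, because only the filename is compared.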
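The export-side variant could look roughly like the sketch below. It assumes that only URLs under the exporting node's own base URL are reduced to a filename, so that external thumbnails are not lost. That restriction, the function name, and the example URLs are assumptions, not something GeoNetwork is confirmed to do.

```python
import posixpath
from urllib.parse import urlparse


def strip_origin_on_export(ref: str, exporting_node_base: str) -> str:
    """Reduce a local URL to its filename on export (hypothetical helper)."""
    base = exporting_node_base.rstrip("/") + "/"
    if ref.startswith(base):
        name = posixpath.basename(urlparse(ref).path)
        # Fall back to the original reference if no filename can be extracted.
        return name or ref
    # Anything outside the exporting node is external and keeps its URL.
    return ref


print(strip_origin_on_export(
    "http://localhost:8080/geonetwork/srv/resources/thumb.png",
    "http://localhost:8080/geonetwork",
))
```

A bare filename produced this way is what the current import already assigns to the importing portal, which is why this approach needs no import change but does not help MEF files exported earlier.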

pvgenuchten commented 6 years ago

I prefer the first approach (and I'm not so worried about the homonyms): see what files are in the MEF and replace any URLs pointing to them with the new URL.

Note that the same use case applies to files (resources like shape, csv, doc, pdf) being included in the MEF. When metadata is exported from an intranet catalog to an extranet catalog, the links to the files should be updated to match the URL of the external catalog.

PascalLike commented 6 years ago

#2255