To integrate the Pleiades dataset as Linked Open Data (LOD), I propose a model-based solution within XRONOS. A matching method implemented directly in either the Site model or the PleiadesItem model would give us a straightforward way to link XRONOS site names with Pleiades data.
Proposed Enhancement:
1. Introduce a PleiadesItem Model
Add a model named PleiadesItem to store Pleiades site names and their corresponding IDs. This will provide the necessary structure for matching with the Site model.
Example model schema:
class PleiadesItem < ApplicationRecord
  validates :name, presence: true
  validates :pleiades_id, presence: true
end
2. Periodic Synchronization of Pleiades Data
Implement a rake task to populate and update the PleiadesItem model with the latest data from the Pleiades name_index.json. Run on a schedule (for example via cron), it keeps the local copy current.
Example task:
namespace :pleiades do
  desc "Sync Pleiades data with XRONOS"
  task sync: :environment do
    require 'open-uri'
    require 'json'

    url = 'https://raw.githubusercontent.com/ryanfb/pleiades-geojson/gh-pages/name_index.json'
    # Download and parse the name index; each entry is read as a (name, id) pair.
    pleiades_data = JSON.parse(URI.open(url).read)
    pleiades_data.each do |name, id|
      # Names are stored stripped and lowercased so repeated syncs do not create duplicates.
      PleiadesItem.find_or_create_by(name: name.strip.downcase, pleiades_id: id)
    end
  end
end
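The normalization and de-duplication step of the sync can be tried out without Rails. The following is a minimal sketch, assuming each entry of the index is a (name, id) pair; normalize_name_index is a hypothetical helper, and the sample data is made up for illustration rather than taken from Pleiades.

```ruby
require 'json'

# Hypothetical helper: turns the parsed index into unique [name, id] pairs.
# Works for a JSON object ({name => id}) as well as an array of pairs.
def normalize_name_index(parsed)
  parsed.each_with_object({}) do |(name, id), seen|
    next if name.nil? || id.nil?

    key = [name.to_s.strip.downcase, id.to_s]
    next if key.first.empty? # skip blank names

    seen[key] = true
  end.keys
end

# Illustrative sample only, not real Pleiades data.
sample = '[["  Roma ", "1"], ["roma", "1"], ["Athenae", "2"], ["", "3"]]'
pairs = normalize_name_index(JSON.parse(sample))
pairs.each { |name, id| puts "#{name} -> #{id}" }
```

Collapsing duplicates before touching the database would keep the number of find_or_create_by calls down on repeated syncs.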
3. Match Logic in Model
a) Method in the Site Model
Add a method in the Site model to find a corresponding PleiadesItem for a site by comparing names:
Example:
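class Site < ApplicationRecord
  # Returns the first PleiadesItem whose name contains this site's name, or nil.
  def match_to_pleiades
    return nil if name.blank?

    # ILIKE is already case-insensitive; escape % and _ so they match literally.
    pattern = "%#{self.class.sanitize_sql_like(name.strip)}%"
    PleiadesItem.where("name ILIKE ?", pattern).first
  end
end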
Usage:
site = Site.find(1)
match = site.match_to_pleiades
if match
  puts "Site '#{site.name}' matches PleiadesItem '#{match.name}' with ID #{match.pleiades_id}"
else
  puts "No match found for site '#{site.name}'"
end
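One caveat with building the pattern from the site name: any % or _ in the name acts as a LIKE wildcard unless it is escaped. Rails offers sanitize_sql_like for this; the plain-Ruby sketch below only illustrates the idea, assuming backslash is the escape character (the PostgreSQL default). escape_like and contains_pattern are hypothetical helper names.

```ruby
# Hypothetical helper: prefix \, % and _ with the escape character
# so that they match literally inside a LIKE/ILIKE pattern.
def escape_like(text, escape_char = '\\')
  text.gsub(/[\\%_]/) { |ch| escape_char + ch }
end

# Builds a "contains" pattern from a raw name.
def contains_pattern(name)
  "%#{escape_like(name.to_s.strip)}%"
end

puts contains_pattern('100% Cave') # prints %100\% Cave%
puts contains_pattern('site_a')    # prints %site\_a%
```

Without escaping, a name such as "site_a" would also match "siteXa", because _ stands for any single character.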
b) Method in the PleiadesItem Model
Alternatively, add a method in the PleiadesItem model to find all Site entries matching a specific Pleiades name:
Example:
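class PleiadesItem < ApplicationRecord
  # Returns all Site records whose name contains this item's name.
  def match_sites
    # ILIKE is already case-insensitive; escape % and _ so they match literally.
    pattern = "%#{self.class.sanitize_sql_like(name.strip)}%"
    Site.where("name ILIKE ?", pattern)
  end
end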
Usage:
pleiades_item = PleiadesItem.find(1)
matching_sites = pleiades_item.match_sites
matching_sites.each do |site|
  puts "PleiadesItem '#{pleiades_item.name}' matches Site '#{site.name}'"
end
4. Further Tasks
Add fuzzy matching capabilities using gems like amatch or fuzzy_match to improve match accuracy.
Store match results by adding a pleiades_id field to a link table, similar to the Wikidata approach.
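To give a feel for what fuzzy matching would add, here is a minimal standard-library sketch based on Levenshtein edit distance. It is not the API of amatch or fuzzy_match; levenshtein, name_similarity and best_match are hypothetical helpers, and the 0.8 threshold is an arbitrary starting point that would need tuning on real data.

```ruby
# Classic dynamic-programming edit distance (Levenshtein).
def levenshtein(a, b)
  a = a.chars
  b = b.chars
  prev = (0..b.length).to_a
  a.each_with_index do |ca, i|
    curr = [i + 1]
    b.each_with_index do |cb, j|
      cost = ca == cb ? 0 : 1
      # deletion, insertion, substitution
      curr << [prev[j + 1] + 1, curr[j] + 1, prev[j] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Similarity in 0.0..1.0; 1.0 means identical after trimming and lowercasing.
def name_similarity(a, b)
  a = a.strip.downcase
  b = b.strip.downcase
  longest = [a.length, b.length].max
  return 1.0 if longest.zero?

  1.0 - levenshtein(a, b).to_f / longest
end

# Returns [candidate, score] for the best candidate, or nil below the threshold.
def best_match(site_name, candidates, threshold: 0.8)
  scored = candidates.map { |c| [c, name_similarity(site_name, c)] }
  best = scored.max_by { |_, score| score }
  best if best && best.last >= threshold
end

p best_match('Athenai', %w[Athenae Roma Sparta])
p best_match('Uruk', %w[Athenae Roma Sparta])
```

Here 'Athenai' is one edit away from 'Athenae' and clears the threshold, while 'Uruk' has no sufficiently close candidate and yields nil.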
Benefits:
Embeds the matching logic directly into the relevant models, keeping the code clean and cohesive.
Supports efficient and reusable methods for site-to-Pleiades matching.
Potential Issues:
Matching accuracy depends on data quality and name standardization.
Requires maintenance of the PleiadesItem model and periodic synchronization.
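Part of the name standardization issue could be reduced by normalizing both XRONOS and Pleiades names the same way before comparing them. A minimal sketch, assuming that stripping diacritics and punctuation is acceptable for matching purposes; normalize_site_name is a hypothetical helper, not existing XRONOS code.

```ruby
# Hypothetical normalizer: strips diacritics, lowercases,
# and collapses punctuation and whitespace into single spaces.
def normalize_site_name(name)
  name.to_s
      .unicode_normalize(:nfd)     # split letters from their combining marks
      .gsub(/\p{Mn}/, '')          # drop the combining marks
      .downcase
      .gsub(/[^[:alnum:]]+/, ' ')  # punctuation and whitespace runs become one space
      .strip
end

puts normalize_site_name('  Çatalhöyük ')  # prints catalhoyuk
puts normalize_site_name('Tell es-Sultan') # prints tell es sultan
```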
I look forward to feedback on this proposal!