Open ewlarson opened 1 week ago
Adding a Rails-ish reindexing task with a rescue to try and capture whatever is amiss here.
Seeing just 1 document error...
Processed 1000 documents in this batch, total processed: 82000
Processed 1000 documents in this batch, total processed: 83000
Processed 1000 documents in this batch, total processed: 84000
Processed 1000 documents in this batch, total processed: 85000
Processed 1000 documents in this batch, total processed: 86000
Error updating index for document: 0745f15d-b3e9-4a3d-aee7-4dfc47ff2a6e
undefined method `dct_references_uri_key' for an instance of Kithe::Asset
Processed 1000 documents in this batch, total processed: 87000
Processed 1000 documents in this batch, total processed: 88000
Processed 1000 documents in this batch, total processed: 89000
Processed 1000 documents in this batch, total processed: 90000
Processed 1000 documents in this batch, total processed: 91000
Processed 1000 documents in this batch, total processed: 92000
Processed 1000 documents in this batch, total processed: 93000
Processed 1000 documents in this batch, total processed: 94000
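For reference, the shape of a batch reindex with a per-document rescue can be sketched in plain Ruby. This is a minimal illustration, not the actual rake task: `FakeDoc`, `reindex_all`, and the `update_index` call on the stand-in are hypothetical names, and a real task would presumably iterate over `Document` records in batches rather than an in-memory array.

```ruby
# Hypothetical stand-in for a record whose index update may raise.
FakeDoc = Struct.new(:friendlier_id, :broken) do
  def update_index
    raise NoMethodError, "undefined method `dct_references_uri_key'" if broken
    true
  end
end

# Walk the documents in batches, rescue per-document failures, and keep going
# so that one bad record does not abort the whole reindex.
def reindex_all(docs, batch_size: 1000, out: $stdout)
  total = 0
  failures = []
  docs.each_slice(batch_size) do |batch|
    batch.each do |doc|
      begin
        doc.update_index
      rescue StandardError => e
        # Record the offending id and message, then continue with the batch.
        failures << [doc.friendlier_id, e.message]
        out.puts "Error updating index for document: #{doc.friendlier_id}"
        out.puts e.message
      end
    end
    total += batch.size
    out.puts "Processed #{batch.size} documents in this batch, total processed: #{total}"
  end
  failures
end
```

Returning the list of failures makes it easy to look up the problem records afterwards, as done with the single failing id in the console session that follows.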
From rails console...
irb(main):004> d = Document.find_by_friendlier_id("0745f15d-b3e9-4a3d-aee7-4dfc47ff2a6e")
Document Load (2.2ms) SELECT "kithe_models".* FROM "kithe_models" WHERE "kithe_models"."type" = $1 AND "kithe_models"."friendlier_id" = $2 LIMIT $3 [["type", "Document"], ["friendlier_id", "0745f15d-b3e9-4a3d-aee7-4dfc47ff2a6e"], ["LIMIT", 1]]
=>
#<Document:0x000000013afb21c0
...
irb(main):005> d
=>
#<Document:0x000000013afb21c0
id: "d06ba0b9-53e6-4c3e-a5ad-518c0d01f558",
title: "Moral statistics [France] {1833}",
type: "Document",
position: nil,
json_attributes: "[FILTERED]",
created_at: Thu, 29 Feb 2024 08:44:17.000000000 CST -06:00,
updated_at: Fri, 01 Mar 2024 17:18:40.720658000 CST -06:00,
parent_id: nil,
friendlier_id: "0745f15d-b3e9-4a3d-aee7-4dfc47ff2a6e",
file_data: nil,
representative_id: "461ee342-dcf9-432e-b977-0f7dcce15085",
leaf_representative_id: "461ee342-dcf9-432e-b977-0f7dcce15085",
kithe_model_type: "work",
import_id: 112,
publication_state: "published",
dct_title_s: "Moral statistics [France] {1833}",
dct_alternative_sm: ["Guerry"],
dct_description_sm: ["Moral statistics of France (Guerry, 1833)"],
dct_language_sm: ["eng"],
gbl_displayNote_sm: [],
dct_creator_sm: [],
dct_publisher_sm: [],
schema_provider_s: "GeoDa Data and Lab",
gbl_resourceClass_sm: ["Datasets"],
gbl_resourceType_sm: [],
dct_subject_sm: [],
dcat_theme_sm: [],
dcat_keyword_sm: [],
dct_temporal_sm: ["1833"],
dct_issued_s: "",
gbl_indexYear_im: [1833],
gbl_dateRange_drsim: ["1833-1833"],
dct_spatial_sm: ["France"],
locn_geometry: "POLYGON((-5.45 51.31, 9.83 51.31, 9.83 41.26, -5.45 41.26, -5.45 51.31))",
dcat_bbox: "-5.45,41.26,9.83,51.31",
dcat_centroid: "46.285,2.19",
gbl_georeferenced_b: nil,
dct_relation_sm: [],
pcdm_memberOf_sm: ["b0153110-e455-4ced-9114-9b13250a7093"],
dct_isPartOf_sm: ["12d-05"],
dct_source_sm: [],
dct_isVersionOf_sm: [],
dct_replaces_sm: [],
dct_isReplacedBy_sm: [],
dct_rights_sm: [],
dct_rightsHolder_sm: [],
dct_license_sm: [],
dct_accessRights_s: "Public",
dct_format_s: "Shapefile",
gbl_fileSize_s: "",
b1g_creatorID_sm: [],
b1g_geonames_sm: [],
gbl_wxsIdentifier_s: "",
geomg_id_s: "0745f15d-b3e9-4a3d-aee7-4dfc47ff2a6e",
dct_identifier_sm: [],
gbl_suppressed_b: nil,
date_created_dtsi: Thu, 29 Feb 2024 08:44:17.000000000 CST -06:00,
date_modified_dtsi: nil,
b1g_language_sm: [],
b1g_image_ss: "",
b1g_code_s: "12d-05",
b1g_dct_accrualMethod_s: "Manual",
b1g_dct_accrualPeriodicity_s: "",
b1g_dateAccessioned_sm: ["2024-02-29"],
b1g_dateRetired_s: "",
b1g_status_s: "",
b1g_publication_state_s: "published",
b1g_child_record_b: nil,
b1g_dct_mediator_sm: [],
b1g_access_s: "",
dct_references_s:
[#<Document::Reference:0x000000013dd99b60
@attributes={"value"=>"https://geo.btaa.org/uploads/asset/461ee342-dcf9-432e-b977-0f7dcce15085/d7fed7dd22c9dbcba0fd8a296c79ae02.html", "category"=>"documentation_download"}>,
#<Document::Reference:0x000000013dd999a8 @attributes={"value"=>"https://geodacenter.github.io/data-and-lab/data/guerry.zip", "category"=>"download"}>,
irb(main):006> d.save
Document#references > seeded: {"http://lccn.loc.gov/sh85035852"=>["https://geo.btaa.org/uploads/asset/461ee342-dcf9-432e-b977-0f7dcce15085/d7fed7dd22c9dbcba0fd8a296c79ae02.html"], "http://schema.org/downloadUrl"=>["https://geodacenter.github.io/data-and-lab/data/guerry.zip"], "http://schema.org/url"=>["https://geodacenter.github.io/data-and-lab/Guerry/"]}
Document#dct_downloads > init: ["https://geodacenter.github.io/data-and-lab/data/guerry.zip"]
Document#multiple_downloads > aardvark: [{:label=>"Original Shapefile", :url=>"https://geodacenter.github.io/data-and-lab/data/guerry.zip"}]
TRANSACTION (0.4ms) BEGIN
DocumentDownload Load (10.7ms) SELECT "document_downloads".* FROM "document_downloads" WHERE "document_downloads"."friendlier_id" = $1 [["friendlier_id", "0745f15d-b3e9-4a3d-aee7-4dfc47ff2a6e"]]
Document#dct_downloads > document_downloads: [{:label=>"Original Shapefile", :url=>"https://geodacenter.github.io/data-and-lab/data/guerry.zip"}]
Kithe::Asset Load (0.7ms) SELECT "kithe_models"."id", "kithe_models"."title", "kithe_models"."type", "kithe_models"."position", "kithe_models"."json_attributes", "kithe_models"."created_at", "kithe_models"."updated_at", "kithe_models"."parent_id", "kithe_models"."friendlier_id", "kithe_models"."file_data", "kithe_models"."kithe_model_type", "kithe_models"."import_id", "kithe_models"."publication_state" FROM "kithe_models" WHERE "kithe_models"."type" IN ($1, $2) AND "kithe_models"."parent_id" = $3 [["type", "Kithe::Asset"], ["type", "Asset"], ["parent_id", "d06ba0b9-53e6-4c3e-a5ad-518c0d01f558"]]
Kithe::Model Load (0.8ms) SELECT "kithe_models".* FROM "kithe_models" WHERE "kithe_models"."id" = $1 [["id", "d06ba0b9-53e6-4c3e-a5ad-518c0d01f558"]]
TRANSACTION (0.2ms) ROLLBACK
/Users/ewlarson/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/activemodel-7.0.8.6/lib/active_model/attribute_methods.rb:450:in `method_missing': undefined method `dct_references_uri_key' for an instance of Kithe::Asset (NoMethodError)
irb(main):007> d.dct_references_s
=>
[#<Document::Reference:0x000000013dd99b60
@attributes={"value"=>"https://geo.btaa.org/uploads/asset/461ee342-dcf9-432e-b977-0f7dcce15085/d7fed7dd22c9dbcba0fd8a296c79ae02.html", "category"=>"documentation_download"}>,
#<Document::Reference:0x000000013dd999a8 @attributes={"value"=>"https://geodacenter.github.io/data-and-lab/data/guerry.zip", "category"=>"download"}>,
#<Document::Reference:0x000000013dd997f0 @attributes={"value"=>"https://geodacenter.github.io/data-and-lab/Guerry/", "category"=>"documentation_external"}>]
irb(main):008> d.assets
/Users/ewlarson/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/activemodel-7.0.8.6/lib/active_model/attribute_methods.rb:450:in `method_missing': undefined method `assets' for an instance of Document (NoMethodError)
Did you mean? asset!
asset?
irb(main):009> d.document_assets
Kithe::Asset Load (1.1ms) SELECT "kithe_models"."id", "kithe_models"."title", "kithe_models"."type", "kithe_models"."position", "kithe_models"."json_attributes", "kithe_models"."created_at", "kithe_models"."updated_at", "kithe_models"."parent_id", "kithe_models"."friendlier_id", "kithe_models"."file_data", "kithe_models"."kithe_model_type", "kithe_models"."import_id", "kithe_models"."publication_state" FROM "kithe_models" WHERE "kithe_models"."type" IN ($1, $2) AND "kithe_models"."parent_id" = $3 [["type", "Kithe::Asset"], ["type", "Asset"], ["parent_id", "d06ba0b9-53e6-4c3e-a5ad-518c0d01f558"]]
Kithe::Model Load (0.5ms) SELECT "kithe_models".* FROM "kithe_models" WHERE "kithe_models"."id" = $1 [["id", "d06ba0b9-53e6-4c3e-a5ad-518c0d01f558"]]
=>
[#<Kithe::Asset:0x000000013c698498
id: "461ee342-dcf9-432e-b977-0f7dcce15085",
title: "Guerry_documentation.html",
type: "Kithe::Asset",
position: 1,
json_attributes: nil,
created_at: Fri, 01 Mar 2024 13:56:27.378069000 CST -06:00,
updated_at: Fri, 01 Mar 2024 13:56:27.480131000 CST -06:00,
parent_id: "d06ba0b9-53e6-4c3e-a5ad-518c0d01f558",
friendlier_id: "hd9mhb9ky",
file_data:
{"id"=>"asset/461ee342-dcf9-432e-b977-0f7dcce15085/d7fed7dd22c9dbcba0fd8a296c79ae02.html",
"storage"=>"store",
"metadata"=>{"size"=>17577, "width"=>nil, "height"=>nil, "filename"=>"Guerry_documentation.html", "mime_type"=>"text/html"}},
kithe_model_type: "asset",
import_id: nil,
publication_state: "draft">]
Okay... so turns out we had one unexpected model type in the database — perhaps from before our DocumentAssets work was fully baked.
{"count"=>35534, "type"=>"Asset"}
{"count"=>106455, "type"=>"Document"}
{"count"=>1, "type"=>"Kithe::Asset"}
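Counts like the ones above come from grouping rows on the `type` column. A small plain-Ruby sketch of that check follows; `type_report` and `EXPECTED_TYPES` are hypothetical names, and the allow-list assumes `Document` and `Asset` are the only types the application expects.

```ruby
# Hypothetical allow-list: the only STI types we expect to see.
EXPECTED_TYPES = %w[Document Asset].freeze

# Count rows per value of the `type` column and flag any unexpected values.
def type_report(types)
  counts = types.tally
  unexpected = counts.keys - EXPECTED_TYPES
  { counts: counts, unexpected: unexpected }
end

# In a Rails console the counts would more likely come from the database,
# e.g. Kithe::Model.group(:type).count (an assumption, not shown in this issue).
report = type_report(%w[Document Asset Document Kithe::Asset])
puts report[:counts]
puts "Unexpected: #{report[:unexpected].inspect}"
```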
In our database we should only have Documents and Assets. Kithe::Asset is technically the superclass of our Asset model.
Removing the Kithe::Asset from the database resolves this issue.
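Why a single stray row matters can be shown with a plain-Ruby sketch. With single-table inheritance, the stored `type` string decides which class a row is instantiated as, so a row typed `Kithe::Asset` never gets methods defined only on the subclass. The classes below are stand-ins, not the real gem or application code, and the sketch assumes `dct_references_uri_key` lives on the application's `Asset` subclass.

```ruby
module Kithe
  # Stand-in for the gem's base asset class (not the real implementation).
  class Asset
  end
end

# Stand-in for the application's subclass, which adds app-specific methods.
class Asset < Kithe::Asset
  def dct_references_uri_key
    "documentation_download" # hypothetical return value
  end
end

# Mimic how single-table inheritance picks a class from the stored `type` string.
def instantiate(type_name)
  Object.const_get(type_name).new
end

# True when an object built from the given type string has the subclass method.
def responds?(type_name)
  instantiate(type_name).respond_to?(:dct_references_uri_key)
end
```

Here `responds?("Asset")` is true and `responds?("Kithe::Asset")` is false, which mirrors the NoMethodError raised during `d.save`.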
Migrate/backport this rake task from GEOMG. However, testing against a production pg_dump produces this error: