Closed by gjost 7 years ago
Behavior confirmed:
(ddrlocal)ddr@denshodeb8:/usr/local/src/ddr-local/ddrlocal$ python manage.py shell
>>>
>>> from DDR import identifier
>>> e = identifier.Identifier(id='ddr-pc-33-15', base_path='/var/www/media/ddr').object()
>>> for topic in e.topics:
...     print topic
...
{u'term': u'i', u'id': u'277'}
{u'term': u'Journalism and media: Community publications: Pacific Citizen', u'id': u'389'}
{u'term': u'Race and racism', u'id': u'36'}
{u'term': u'Race and racism: Cross-racial relations', u'id': u'38'}
{u'term': u'Race and racism: Discrimination', u'id': u'37'}
Note that this entity.json is already damaged:
(ddrlocal)ddr@denshodeb8:/var/www/media/base/ddr-pc-33$ less files/ddr-pc-33-15/entity.json
...
{
    "topics": [
        {
            "id": "277",
            "term": "i"
        },
        {
            "id": "389",
            "term": "Journalism and media: Community publications: Pacific Citizen"
        },
        {
            "id": "36",
            "term": "Race and racism"
        },
        {
            "id": "38",
            "term": "Race and racism: Cross-racial relations"
        },
        {
            "id": "37",
            "term": "Race and racism: Discrimination"
        }
    ]
},
...
On the plus side, the ID number is intact and writing the entity file doesn't seem to further damage the data:
(ddrlocal)ddr@denshodeb8:/usr/local/src/ddr-local/ddrlocal$ python manage.py shell
>>> from DDR import identifier
>>> e = identifier.Identifier(id='ddr-pc-33-15', base_path='/var/www/media/ddr').object()
>>> e.write_json()
>>>
(ddrlocal)ddr@denshodeb8:/var/www/media/base/ddr-pc-33$ git diff
diff --git a/files/ddr-pc-33-15/entity.json b/files/ddr-pc-33-15/entity.json
index 8316ca1..0171b0c 100644
--- a/files/ddr-pc-33-15/entity.json
+++ b/files/ddr-pc-33-15/entity.json
@@ -1,10 +1,10 @@
[
{
- "app_commit": "9d906ffdb5df85c59fd57034abcb424bb302202d (HEAD, origin/209-upgrade-elasticsearch, 209-upgrade-elasticsearch) 2017-01-30 17:45:05 -0800",
+ "app_commit": "00d6bf004a20c921f921fa5f28616ce642a51958 (HEAD, tag: v2.0, origin/master, origin/HEAD, master) 2017-05-03 11:27:32 -0700",
"app_release": "0.9.4-beta",
"application": "https://github.com/densho/ddr-cmdln.git",
- "git_version": "git version 2.1.4; git-annex version: 5.20141125\nbuild flags: Assistant Webapp Webapp-secure Pairing Testsuite S3 WebDAV Inotify DBus DesktopNotify XMPP DNS Feeds Quvi TDFA CryptoHash\nkey/value backends: SHA256E SHA1E SHA512E SHA224E SHA384E SKEIN256E SKEIN512E SHA256 SHA1 SHA512 SHA224 SHA384 SKEIN256 SKEIN512 WORM URL\nremote types: git gcrypt S3 bup directory rsync web webdav tahoe glacier ddar hook external\nlocal repository version: unknown\nsupported repository version: 5\nupgrade supported from repository versions: 0 1 2 4",
- "models_commit": "2106bb0a6c686e4258c0d9d02d1ced96c02f357f 2017-01-23 17:11:28 -0800"
+ "git_version": "git version 2.1.4; git-annex version: 5.20141125\nbuild flags: Assistant Webapp Webapp-secure Pairing Testsuite S3 WebDAV Inotify DBus DesktopNotify XMPP DNS Feeds Quvi TDFA CryptoHash\nkey/value backends: SHA256E SHA1E SHA512E SHA224E SHA384E SKEIN256E SKEIN512E SHA256 SHA1 SHA512 SHA224 SHA384 SKEIN256 SKEIN512 WORM URL\nremote types: git gcrypt S3 bup directory rsync web webdav tahoe glacier ddar hook external\nlocal repository version: 5\nsupported repository version: 5\nupgrade supported from repository versions: 0 1 2 4",
+ "models_commit": "8c5e0b200fe5f02c9216fd4bc3be42d46d881cf5 2017-02-01 14:36:59 -0800"
},
{
"id": "ddr-pc-33-15"
Detail from the most recent commit.
(ddrlocal)ddr@denshodeb8:/var/www/media/base/ddr-pc-33$ git log -n1 --format=full --patch files/ddr-pc-33-15/entity.json
commit 26b61ec199b3e3e9ffa189caa18a2c795f8756e9
Author: DDRAdmin <REDACTED@SERVER.ORG>
Commit: DDR Integration Manager <REDACTED@SERVER.ORG>
Manual commit after ddr-transform run
diff --git a/files/ddr-pc-33-15/entity.json b/files/ddr-pc-33-15/entity.json
index 66001e4..8316ca1 100644
--- a/files/ddr-pc-33-15/entity.json
+++ b/files/ddr-pc-33-15/entity.json
...
@@ -80,11 +84,26 @@
},
{
"topics": [
- "Geographic communities: Hawai'i [277]",
- "Journalism and media: Community publications: Pacific Citizen [389]",
- "Race and racism [36]",
- "Race and racism: Cross-racial relations [38]",
- "Race and racism: Discrimination [37]"
+ {
+ "id": "277",
+ "term": "i"
+ },
+ {
+ "id": "389",
+ "term": "Journalism and media: Community publications: Pacific Citizen"
+ },
+ {
+ "id": "36",
+ "term": "Race and racism"
+ },
+ {
+ "id": "38",
+ "term": "Race and racism: Cross-racial relations"
+ },
+ {
+ "id": "37",
+ "term": "Race and racism: Discrimination"
+ }
]
},
{
...
Topics appear fine in ddr-local, including the topic in question. TagManager retrieves topic titles from the vocabs API and uses only the ID stored in the data, and that ID appears to be fine.
Created a test object with ddrlocal. The parsing error happens on both read and write.
...
{
    "topics": [
        {
            "id": "241", "term": "Fiction"
        },
        {
            "id": "277", "term": "i"
        },
        {
            "id": "268", "term": "Pottery"
        }
    ]
},
...
The topics field in ddr-defs (/usr/local/src/ddr-local/ddr-defs/repo_models/entity.py) goes through several different converter functions:
def jsonload_topics(text): return converters.text_to_bracketids(text, ['term','id'])
def display_topics( data ): return _display_multiline_dict('<a href="{{ data.id }}">{{ data.term }}</a>', data)
def formprep_topics(data): return converters.listofdicts_to_textnolabels(data, ['term','id'])
def formpost_topics(text): return converters.text_to_dicts(text, ['term', 'id'])
def csvload_topics( text ): return converters.text_to_listofdicts(text)
def csvdump_topics(data): return converters.listofdicts_to_text(data)
This is partially by design: jsonload_* is supposed to ingest old formats but only save things in new/current format.
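A rough sketch of that "accept old, emit new" idea, for illustration only. `load_topic` and `load_topics` are hypothetical names, not functions from ddr-cmdln or ddr-defs, and the sketch assumes every legacy string ends in a bracketed ID, as in the `"term [id]"` strings shown in the diff above.

```python
def load_topic(item):
    """Return one topic as a {'term': ..., 'id': ...} dict.

    Hypothetical helper, not the real DDR converter.
    """
    if isinstance(item, dict):
        # Already in the current format; copy only the expected keys.
        return {"term": item["term"], "id": item["id"]}
    # Legacy format: "Some term [123]". Split on the LAST "[" so that
    # punctuation inside the term (colons, apostrophes) is left alone.
    term, _, rest = item.strip().rpartition("[")
    return {"term": term.strip(), "id": rest.rstrip("]").strip()}


def load_topics(items):
    """Normalize a mixed list of legacy strings and current dicts."""
    return [load_topic(item) for item in items]
```

Because already-converted dicts pass through unchanged, re-reading and re-writing a file should leave the topics stable under this scheme, which matches the observation that rewriting the entity file did not damage the data further.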
The regex in converters.py treated the single quote as a word boundary, so for "Geographic communities: Hawai'i [277]" only the trailing "i" was kept as the term.
Fixed in b8a6c44 (not merged yet).
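To make the failure mode concrete, here is a small reproduction. Both patterns are hypothetical stand-ins written for this note; neither is the actual regex from converters.py nor the change in b8a6c44. The first allows only a narrow set of characters in the term, so the match restarts after the apostrophe. The second anchors on the trailing bracketed ID and accepts anything before it.

```python
import re

# Hypothetical patterns for illustration; not the real converters.py regex.
NARROW = re.compile(r"([\w :,\-]+)\[(\d+)\]")   # apostrophe not allowed in term
ANCHORED = re.compile(r"^(.+?)\s*\[(\d+)\]$")   # anything up to the final [id]


def parse_narrow(text):
    # search() finds the leftmost match, which begins after the apostrophe.
    m = NARROW.search(text)
    return {"term": m.group(1).strip(), "id": m.group(2)}


def parse_anchored(text):
    # Anchoring at both ends forces the whole term to be captured.
    m = ANCHORED.match(text.strip())
    return {"term": m.group(1), "id": m.group(2)}


legacy = "Geographic communities: Hawai'i [277]"
damaged = parse_narrow(legacy)     # term is cut down to "i", ID survives
intact = parse_anchored(legacy)    # full term and ID
```

With the narrow pattern the ID still comes through correctly, which is consistent with the ID being intact in the damaged entity.json.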