Open IanMcCrabb opened 8 years ago
Changed Com. Noun to common, Com. Adj. to common, and Proper to proper.
Latest term.sql
https://www.dropbox.com/sh/7118ipfh3jx2byk/AAA5VQPEUk1fEAQ7yuLStMS1a?dl=0
Removed Token-Certainty field
Latest term.sql
https://www.dropbox.com/sh/gydl62ho6g35wei/AAB6uCOYuSVnw0FOC3hPW5Z8a?dl=0
Replaced entity descriptions with short versions.
Latest term.sql
https://www.dropbox.com/sh/te2j2nah6zod8f0/AABJcpRsHAxfa85AxirfKKvLa?dl=0
Still having generation problems:
(645,'en=>"GNoun"',642,776,'{,}',NULL,'Noun',NULL,NULL,'{23}',16,'{2}'), '{665,666}'
(647,'en=>"GAdjective"',642,776,'{, 676,677,678,679,680}',NULL,'Adjective',NULL,NULL,'{23}',16,'{2}'), '{675,676,677,678,679,680}'
Stephen A White Digital Humanities Software Engineer/Consultant email: stephenawhite57@gmail.com ph: +1 425 236 2995
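One way to catch this kind of generation problem before a drop goes out is to scan term.sql for list literals with empty members, such as the '{,}' quoted above. The following is a minimal Python sketch, assuming the problem is limited to array literals with missing IDs; `malformed_arrays` is a hypothetical helper name and not part of READ.

```python
import re

# Matches quoted array literals such as '{665,666}' or '{,}'.
ARRAY_RE = re.compile(r"'\{([^}]*)\}'")


def malformed_arrays(sql_text):
    """Return array literals that contain an empty member (e.g. '{,}')."""
    bad = []
    for match in ARRAY_RE.finditer(sql_text):
        inner = match.group(1)
        # '{}' (an empty array) is valid, so only flag blanks between commas.
        if inner.strip() and any(not part.strip() for part in inner.split(",")):
            bad.append("{" + inner + "}")
    return bad


# The two generated rows quoted in this thread.
sample = """
(645,'en=>"GNoun"',642,776,'{,}',NULL,'Noun',NULL,NULL,'{23}',16,'{2}'),
(647,'en=>"GAdjective"',642,776,'{, 676,677,678,679,680}',NULL,'Adjective',NULL,NULL,'{23}',16,'{2}'),
"""

for literal in malformed_arrays(sample):
    print("malformed list:", literal)
```

Running this over a full term.sql drop would list every suspect literal; it does not say what the correct members should be.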
Modified parent of CustomType from AnnotationType to TagType
Have added CitationContainer and updated term.sql. The up-to-date version is:
https://www.dropbox.com/s/hdh8xafk2ukikx7/term.sql?dl=0
Please note that the terms you have manually added are out by 2.
You have:
(1385,'en=>"DictionaryCitation"',756,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
(1386,'en=>"CitationContainer"',1385,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
I have:
(1385,'en=>"DictionaryCitation"',735,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
(1386,'en=>"CustomType"',8,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
(1387,'en=>"common"',524,777,NULL,NULL,'Common Adjective',NULL,NULL,'{23}',16,'{2}'),
(1388,'en=>"CitationContainer"',1385,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}');
Checked my notes and I had indeed added 1386 and 1387 on 12 4
https://www.dropbox.com/s/rd358p45187n4pj/term.sql?dl=0
Steve, have you identified the terms that need to be added to support Palaeography? My notes from last synch were:
BaseType 1-10, FootMarkType 0 and 1-5, Vowel 1-10
I haven't identified the complete taxonomy. I talked with Stefan and he would like to defer to Andrew on this, so I intend to discuss this at our next synch. There may need to be a tag for choosing a selected group for display/publication. Beyond that I think we may be covered for version 1 since we have custom tags.
I'm pretty sure that we called the last one VowelType. Of course BaseType, FootMarkType and VowelType will need to sit in some sort of Paleography classification hierarchy (PaleographicClass or Paleography). Thinking long term (next 3 months), we need to ensure that the RML is human readable, which suggests that we err on the side of verbosity, meaning that BaseType likely needs to contain Type1 - Type10 instead of 1-10 (though I'm not completely sold on this line of reasoning).
Cheers Steven. Let's finalise at the next synch.
Have updated term.sql.
Modification to ParentTerm of Formulae in SequenceType to bring it in under Analysis hierarchy so that these sequences are empossed in loadTextEdtions.phs
From Steve 2/5
Not sure how this happened, but the current terms file in the repo differs from your new version:
544 has changed to 1387 (unclear why)
262 has changed to 1386 (unclear why)
1386 has changed to 1389 (this one needs to match)
Current (used by Andrew for migration db):
(1384,'en=>"nf."',488,778,NULL,NULL,'Neuter/Female',NULL,NULL,'{23}',16,'{2}'),
(1385,'en=>"DictionaryCitation"',756,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
(1386,'en=>"CitationContainer"',1385,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
(534,'en=>"common"',524,777,NULL,'Rather than have an additional SubPos (special case noun) this is identified by the selection of a multi-gender value for gender on the Lemma in which case the inflection UI for gender is enabled','Common Noun',NULL,NULL,'{23}',16,'{2}'),
(535,'en=>"proper"',524,777,NULL,'Rather than have an additional SubPos (special case noun) this is identified by the selection of a multi-gender value for gender on the Lemma in which case the inflection UI for gender is enabled','Proper Noun',NULL,NULL,'{23}',16,'{2}'),
(544,'en=>"common"',524,777,NULL,NULL,'Common Adjective',NULL,NULL,'{23}',16,'{2}'),
(598,'en=>"common"',587,777,NULL,NULL,'Common Noun',NULL,NULL,'{23}',16,'{2}'),
(608,'en=>"common"',587,777,NULL,NULL,'Common Adjective',NULL,NULL,'{23}',16,'{2}'),
(675,'en=>"common"',674,778,NULL,NULL,'Common Adjective',NULL,NULL,'{23}',16,'{2}'),
(753,'en=>"Donation"',751,778,'{752,753,754,972,1345}',NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
(754,'en=>"Honor"',751,778,'{752,753,754,972,1345}',NULL,NULL,NULL,NULL,'{23}',16,'{2}');
New:
(1384,'en=>"nf."',488,778,NULL,NULL,'Neuter/Female',NULL,NULL,'{23}',16,'{2}'),
(1385,'en=>"DictionaryCitation"',735,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
(1386,'en=>"CustomType"',8,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
(534,'en=>"common"',524,777,NULL,'Rather than have an additional SubPos (special case noun) this is identified by the selection of a multi-gender value for gender on the Lemma in which case the inflection UI for gender is enabled','Common Noun',NULL,NULL,'{23}',16,'{2}'),
(535,'en=>"proper"',524,777,NULL,'Rather than have an additional SubPos (special case noun) this is identified by the selection of a multi-gender value for gender on the Lemma in which case the inflection UI for gender is enabled','Proper Noun',NULL,NULL,'{23}',16,'{2}'),
(598,'en=>"common"',587,777,NULL,NULL,'Common Noun',NULL,NULL,'{23}',16,'{2}'),
(608,'en=>"common"',587,777,NULL,NULL,'Common Adjective',NULL,NULL,'{23}',16,'{2}'),
(675,'en=>"common"',674,778,NULL,NULL,'Common Adjective',NULL,NULL,'{23}',16,'{2}'),
(753,'en=>"Donation"',751,778,'{752,753,754,972,1345}',NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
(754,'en=>"Honor"',751,778,'{752,753,754,972,1345}',NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
(1387,'en=>"common"',524,777,NULL,NULL,'Common Adjective',NULL,NULL,'{23}',16,'{2}'),
(1388,'en=>"Formulae"',740,778,NULL,NULL,'Formuale',NULL,NULL,'{23}',16,'{2}'),
(1389,'en=>"CitationContainer"',1385,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}');
Steve, Oversight on my part. Neglected to flag these as being modified terms so they got new records which pushed 1386 out to 1389. Anyway, all sorted now.
Please use the updated terms in Dropbox.
Oversight on my part. Neglected to flag 262 and 544 and DictionaryCitation as being modified terms so they got new records which pushed 1386 out to 1389. Anyway, all sorted now.
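ID shifts like the ones discussed above could be caught by comparing two term.sql drops before they are pushed. The following is a rough Python sketch, assuming each term is a value tuple that begins with trm_id, the English label and the parent id, as in the rows quoted in this thread. The function names are hypothetical, and the sample rows are a subset of the "current" and "new" lists.

```python
import re

# Captures trm_id, English label and parent id from the start of each value tuple.
TERM_RE = re.compile(r"""\((\d+),'en=>"([^"]*)"',(\d+|NULL),""")


def parse_terms(sql_text):
    """Map trm_id -> (label, parent_id) for every value tuple found."""
    terms = {}
    for trm_id, label, parent in TERM_RE.findall(sql_text):
        terms[int(trm_id)] = (label, None if parent == "NULL" else int(parent))
    return terms


def changed_ids(old, new):
    """IDs present in both drops whose label or parent differs."""
    return {i: (old[i], new[i]) for i in old.keys() & new.keys() if old[i] != new[i]}


def moved_terms(old, new):
    """Terms that lost their old ID and reappear (same label and parent) under a new one."""
    new_only = {}
    for i in new.keys() - old.keys():
        new_only.setdefault(new[i], []).append(i)
    moves = []
    for i, key in sorted(old.items()):
        if new.get(i) != key and key in new_only:
            moves.append((key[0], i, sorted(new_only[key])))
    return moves


old_drop = """
(1385,'en=>"DictionaryCitation"',756,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
(1386,'en=>"CitationContainer"',1385,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
(544,'en=>"common"',524,777,NULL,NULL,'Common Adjective',NULL,NULL,'{23}',16,'{2}'),
"""
new_drop = """
(1385,'en=>"DictionaryCitation"',735,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
(1386,'en=>"CustomType"',8,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
(1387,'en=>"common"',524,777,NULL,NULL,'Common Adjective',NULL,NULL,'{23}',16,'{2}'),
(1389,'en=>"CitationContainer"',1385,778,NULL,NULL,NULL,NULL,NULL,'{23}',16,'{2}'),
"""

old_terms, new_terms = parse_terms(old_drop), parse_terms(new_drop)
for label, old_id, new_ids in moved_terms(old_terms, new_terms):
    print(f"{label}: {old_id} -> {new_ids}")
for trm_id in sorted(changed_ids(old_terms, new_terms)):
    print(f"{trm_id}: label or parent changed")
```

Matching on label plus parent is only a heuristic, since labels such as "common" repeat under several parents; a term whose parent was also changed would show up as changed rather than moved.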
Updated terms to include all Paleography terms:
https://www.dropbox.com/s/byjesp9yaebb06n/term.sql?dl=0
FYI (low priority): just checked and there are 2 spellings, paleography (US, with more hits on Google and in word dictionary) and palaeography. Currently we have placed palaeography in the terms table. Need to test any type-ahead and consider what we want to publish, as the export functionality will look up the term and output it.
Andrew used ‘paleography’ in his paleography. I use ‘paleography’ even though geographically in Europe.
Dr. Stefan Baums Institute for Indian and Tibetan Studies Ludwig Maximilian University of Munich
The new terms file has changed Formulae from
(750,'en=>"Formulae"',735,778,NULL,NULL,'Formuale',NULL,NULL,'{23}',16,'{2}')
to
(1417,'en=>"Formulae"',740,778,NULL,NULL,'Formuale',NULL,NULL,'{23}',16,'{2}')
Yeah, my bad. Had changed ParentType to Analysis and neglected to flag as modified. Have updated in master file so will come through on next drop.
Have changed spelling to paleography in master file. Won't bother generating another drop of terms until after we speak this coming synch and we confirm the Base, Footmark and Vowel types.
Have fixed error where parentage of DictionaryCitation and CitationContainer were reversed. Latest drop is
Push up latest
Added Default under Paleography. Removed RelicEstablishment terms from SequenceType and TagType.
https://www.dropbox.com/s/pidr51sh8yoemc2/term.sql?dl=0
All formulae terms have been removed, and all paleography terms have ,14,'{14}') instead of ,16,'{2}') for owner and visibility.
Indeed, I removed the formulae terms on purpose. I don't want conflicts with the set of terms I am building above 50000.
The 14 14 was a copy-down error. Have fixed this and will update terms in a couple of minutes.
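A copy-down slip like this could be spotted by scanning a drop for rows whose trailing owner and visibility values differ from the usual ,16,'{2}'). The following is a small Python sketch under that assumption. `suffix_outliers` is a hypothetical name, the second sample row is invented purely for illustration, and some terms may legitimately carry other values, so the output is a list to review rather than a list of errors.

```python
import re

# Captures trm_id plus the trailing owner id and visibility list of each tuple.
ROW_RE = re.compile(r"""\((\d+),'en=>.*?,(\d+),'(\{[^}]*\})'\)""", re.S)


def suffix_outliers(sql_text, expected=(16, "{2}")):
    """Return (trm_id, owner, visibility) for rows that do not end with `expected`."""
    outliers = []
    for trm_id, owner, visibility in ROW_RE.findall(sql_text):
        if (int(owner), visibility) != expected:
            outliers.append((int(trm_id), int(owner), visibility))
    return outliers


sample = """
(1417,'en=>"Formulae"',740,778,NULL,NULL,'Formuale',NULL,NULL,'{23}',16,'{2}'),
(99999,'en=>"HypotheticalPaleographyTerm"',1,778,NULL,NULL,NULL,NULL,NULL,'{23}',14,'{14}'),
"""  # the 99999 row is made up; it only mimics the ,14,'{14}') suffix described above

for trm_id, owner, visibility in suffix_outliers(sample):
    print(f"term {trm_id}: owner={owner} visibility={visibility}")
```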
This has a slight impact on the kanishka test data that was faked up. We will need to consider replacing it at some time, though currently I don't think it will have an impact. The easiest way to ensure this is to rebuild kanishkatest; any entity type id will error if it's not available.
Have updated Terms to address #544 and #543. Latest terms is in dropbox:
https://www.dropbox.com/s/a6qv81lya727aw9/term.sql?dl=0
Afraid I haven't had time to have the export routine adjusted for term sort. Next time ...
Have updated Terms to address #545. Latest terms is in dropbox:
Steve, can you push up latest code. Would like to ensure that I keep terms in synch.
Sure can. Need to synch all the code. There are still a couple of issues with this term update process.
1 - Will code for ordering terms when I get some clear air next week.
2 - Checked latest drop and 1224 is definitely still there! Last one in the file.
https://www.dropbox.com/s/wvq84q87clt3w90/term.sql?dl=0
(1224,'en=>"LanguageScript"',1215,806,'{721,722,723,1422}','A Term record with a constrained vocabulary derived from LanguageScript.','run_script_id','http://www.gandhari.org/kanishka/model/document/Run-LanguageScript.htm',NULL,'{23}',16,'{22}')
Code for ordering terms when I get some clear air next week
Do a review and dispense with redundant XxxType suffixes.
Have updated Terms to address #563. Latest terms is in dropbox:
https://www.dropbox.com/s/pww9yjgxemoh6tc/term.sql?dl=0
Could you assign this task back to me once you have actioned it? I need it as a reminder to get the ordering of terms coded this week and to review Terms with redundant Type suffixes.
Have updated Terms to address #625 and #618. Latest terms is in dropbox:
Have updated Terms to address #635 and #628 Latest terms is in dropbox:
The current terms file removes and reassigns several terms. This will need to be corrected before I can change.
I will debug this on my local machine using 001-005. Let me know if there are any other cases.
Query for checking term usage in entity types for the database:
select c.trm_id,
       concat(c.trm_labels::hstore->'en','_',p.trm_labels::hstore->'en') as label_parentlabel,
       c.trm_code
from term c left join term p on c.trm_parent_id = p.trm_id
where c.trm_id in (
  select distinct(ano_type_id) from annotation union
  select distinct(img_type_id) from image union
  select distinct(prn_type_id) from propernoun union
  select distinct(ugr_type_id) from usergroup union
  select distinct(itm_type_id) from item union
  select distinct(prt_type_id) from part union
  select distinct(bln_type_id) from baseline union
  select distinct(gra_type_id) from grapheme union
  select distinct(lem_type_id) from lemma union
  select distinct(cat_type_id) from catalog union
  select distinct(edn_type_id) from edition union
  select distinct(seq_type_id) from sequence);
Someone should write a report query that gives us the usage counts by entity.
Hi Ian, I wrote a query that captures most of these. I had trm_type_id in there and removed it as it overwhelmed the results. I have not yet created the part that checks the multiple-type fields like txt_type_ids. Here is the query:
select c.trm_id,
       concat(c.trm_labels::hstore->'en','_',p.trm_labels::hstore->'en') as label_parentlabel,
       c.trm_code
from term c left join term p on c.trm_parent_id = p.trm_id
where c.trm_id in (
  select distinct(ano_type_id) from annotation union
  select distinct(img_type_id) from image union
  select distinct(prn_type_id) from propernoun union
  select distinct(ugr_type_id) from usergroup union
  select distinct(itm_type_id) from item union
  select distinct(prt_type_id) from part union
  select distinct(bln_type_id) from baseline union
  select distinct(gra_type_id) from grapheme union
  select distinct(lem_type_id) from lemma union
  select distinct(cat_type_id) from catalog union
  select distinct(edn_type_id) from edition union
  select distinct(seq_type_id) from sequence);
and the results for vp2sk:
 trm_id |           label_parentlabel           | trm_code
--------+---------------------------------------+-----------
    266 | Comment_CommentaryType                |
    274 | FootNoteType_AnnotationType           |
    355 | Image_BaseLineType                    |
    356 | Transcription_BaseLineType            |
    373 | Glossary_CatalogType                  |
    476 | Published_EditionType                 |
    503 | Consonant_GraphemeType                |
    504 | Vowel_GraphemeType                    |
    505 | NumberSign_GraphemeType               |
    506 | Punctuation_GraphemeType              |
    507 | Unknown_GraphemeType                  |
    508 | VowelModifier_GraphemeType            |
    509 | IntraSyllablePunctuation_GraphemeType |
    511 | User_GroupType                        |
    512 | Group_GroupType                       |
    513 | System_GroupType                      |
    515 | InscriptionRubbing_ImageType          |
    517 | InscriptionPhotograph_ImageType       |
    703 | Person_ProperNounType                 |
    704 | Place_ProperNounType                  |
    705 | School_ProperNounType                 |
    706 | Dynasty_ProperNounType                |
    736 | TextPhysical_SequenceType             |
    737 | LinePhysical_TextPhysical             | Line
    738 | Text_SequenceType                     |
    739 | TextDivision_Text                     |
    740 | Analysis_SequenceType                 |
    741 | Chapter_Analysis                      | Chapter
    743 | Pada_Stanza                           | Pada
    744 | Paragraph_Chapter                     | Paragraph
    761 | Translation_TranslationType           |
   1389 | 1_BaseType                            |
   1417 | Default_Paleography                   |
    742 | Stanza_Chapter                        | Verse
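For the usage counts by entity requested earlier, one option is to generate a `union all` report from the same table and column pairs used in the query above. The following Python sketch only builds the SQL text. It has not been run against a READ database, `usage_count_sql` is a hypothetical helper, and the multiple-type fields such as txt_type_ids are not covered.

```python
# (table, type column) pairs taken from the query in this thread.
TYPE_COLUMNS = [
    ("annotation", "ano_type_id"),
    ("image", "img_type_id"),
    ("propernoun", "prn_type_id"),
    ("usergroup", "ugr_type_id"),
    ("item", "itm_type_id"),
    ("part", "prt_type_id"),
    ("baseline", "bln_type_id"),
    ("grapheme", "gra_type_id"),
    ("lemma", "lem_type_id"),
    ("catalog", "cat_type_id"),
    ("edition", "edn_type_id"),
    ("sequence", "seq_type_id"),
]


def usage_count_sql(type_columns=TYPE_COLUMNS):
    """Build one query returning (entity, trm_id, uses) for every entity table."""
    parts = [
        f"select '{table}' as entity, {col} as trm_id, count(*) as uses "
        f"from {table} group by {col}"
        for table, col in type_columns
    ]
    # union all keeps one row per (entity, term) pair; the order by applies to the whole union.
    return "\nunion all\n".join(parts) + "\norder by entity, trm_id;"


report_sql = usage_count_sql()
print(report_sql)
```

The generated statement can be pasted into psql; joining its result to the term table on trm_id would add the labels shown in the table above.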
Pulled down latest commit 25a3e347a899eef1966db25f3269d4d60cb60932 which includes your latest synched terms. Rebuild of test.db fails. Once we resolve this I'll work through the above issues.
fixed and pushed
Have pulled down your latest term list SHA-1 3e7688a77811f1192792d4f69a68875d175e7461 and worked through all our synch issues. Have generated a new term list.
https://www.dropbox.com/s/y87yfgpx6su4of4/term.sql?dl=0
We should now be up to date. Please note the following changes.
Following AnnotationType terms have been added: Creator, License
Following AttributionType terms have been added: PrimaryEdition, SecondaryEdition, VisualDocumentation
Following LanguageScript terms have been added: pra-Brah, pli-Brah, pyx, omx, obx
SequenceType terms modified (same IDs): Verse > Stanza, Pada > Pāda
TermType added (to support the Measurement field): FK-PairMultiple_Semantic(Term)-Text
Type for the Measurement field in Item, Part, Fragment entities modified to the following: FK-PairMultiple_Semantic(Term)-Text
Following terms removed from Translation (will maintain above 50k): FluidTranslation, LiteralTranslation
Have pulled down latest SHA-1 f6ec1e18a86676f9c5248f20ca9d61ba3c7b6568 and generated a new Term list to address following issues:
https://www.dropbox.com/s/l86pwl87ksqluuz/term.sql?dl=0
Removed following terms and retained ParentTerm only:
-628 en=>"ManufactureTechniqueOption1"
-629 en=>"ManufactureTechniqueOption2"
-634 en=>"ObjectMediumOption1"
-635 en=>"ObjectMediumOption2"
-637 en=>"ObjectShapeOption1"
-638 en=>"ObjectShapeOption2"
-640 en=>"ObjectTypeOption1"
-641 en=>"ObjectTypeOption2"
TermType modified to Text-Multiple for following:
-965 en=>"Location-LocationReferences"
-1041 en=>"Location-LocationReferences"
Removed field from Surface entity:
-1283 en=>"Scripts"
Added to AttributionType:
-Lemma
Looked at the terms and AnnotationType - TranslationType - (Translation | English | Chaya) where Translation has to remain 761 or we have to update the vp2sk database. I was under the impression that we wanted AnnotationType - Translation - (English | Chaya).
I'm unclear with the 31 7 generation of term where Lemma has been added.
Lemma was added as an Attribution type.
(1428,'en=>"Lemma"',339,778,NULL,NULL,NULL,NULL,NULL,'{23}',21,'{2}'),
Discussion about structure of AnnotationType - TranslationType - (English | Chaya) is in https://github.com/stevewh/READ/issues/545
Looks like what's happened is that I've misconstrued a later discussion we had. I had changed English to Translation rather than TranslationType to Translation.
So, have modified AnnotationType - Translation - (English | Chaya) . Latest drop is in
Have set up this task for ongoing maintenance to Terms. Please use this issue for any requirements for modifications to Terms. I will respond and generate a new term.sql. I will add the dropbox link to the new term.sql and assign to Steve to incorporate the new term list