Open hugolpz opened 5 months ago
Lingualibre's "External tool" query to external endpoints is ideal when we want to keep a link to Wikidata items or lexemes. It makes later contributions back to Wikidata easier, such as reinjecting Lingualibre's audio files into the correct Wikidata item or lexeme pages.
I tested this query on your project:
PREFIX qwb: <https://qichwa.wikibase.cloud/entity/>
PREFIX qdp: <https://qichwa.wikibase.cloud/prop/direct/>
PREFIX qp: <https://qichwa.wikibase.cloud/prop/>
PREFIX qps: <https://qichwa.wikibase.cloud/prop/statement/>
PREFIX qpq: <https://qichwa.wikibase.cloud/prop/qualifier/>
PREFIX qpr: <https://qichwa.wikibase.cloud/prop/reference/>
PREFIX qno: <https://qichwa.wikibase.cloud/prop/novalue/>
select ?entry ?id ?idLabel ?posLabel
where {
?entry a ontolex:LexicalEntry;
wikibase:lemma ?idLabel;
wikibase:lexicalCategory [rdfs:label ?posLabel] . filter(lang(?posLabel)="en")
OPTIONAL {
?entry qdp:P1 ?wikidata.
BIND (iri(concat("http://www.wikidata.org/entity/",?wikidata)) as ?id)
}
}
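For reference, a query like the one above can be turned into a request URL for the gadget or for a quick manual test. This is only a minimal sketch: `buildSparqlUrl` is a hypothetical helper name, the endpoint is the `qichwa.wikibase.cloud/query/sparql` address used in the gadget code of this thread, and the short query is a placeholder rather than the full query.

```javascript
// Hypothetical helper (not part of the RecordWizard code): builds a GET URL
// for a SPARQL endpoint, URL-encoding the query and asking for JSON output.
function buildSparqlUrl( endpoint, query ) {
	return endpoint + '?query=' + encodeURIComponent( query ) + '&format=json';
}

// Endpoint taken from this thread; the query below is only a short placeholder.
var QICHWA_ENDPOINT = 'https://qichwa.wikibase.cloud/query/sparql';
var exampleUrl = buildSparqlUrl(
	QICHWA_ENDPOINT,
	'SELECT ?entry WHERE { ?entry a ontolex:LexicalEntry } LIMIT 5'
);
console.log( exampleUrl );
```

Encoding the query with `encodeURIComponent` avoids broken requests when the query contains characters such as `#`, `?` or `&`.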
Your project's data very rarely has a Wikidata ID (P1), so there is currently no point in using the external tool.
You can therefore just as well create an unlinked wiki page list (Telegram discussion > solution 2):
Open https://lingualibre.org/wiki/List:Que/Elwin, where `Que` is your language's ISO 639-3 code.
Add your 6,000 words by hand, one word per line, such as:
# word1
# word2
# word3
Save.
Message me, I will make some edits.
Then, open the Lingualibre.org recording studio.
Step 2: select Quechua
Step 3: select "Local list" > search: List:Que/Elwin
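Typing 6,000 lines by hand is error-prone, so the list body could also be generated from an existing word array and pasted into the wiki page. A minimal sketch, assuming the words are already available as plain strings; `toListWikitext` is a hypothetical helper and the sample words are placeholders:

```javascript
// Hypothetical helper: turns an array of words into the "# word" list format
// shown above, skipping empty entries and trimming stray whitespace.
function toListWikitext( words ) {
	return words
		.map( function ( word ) { return word.trim(); } )
		.filter( function ( word ) { return word !== ''; } )
		.map( function ( word ) { return '# ' + word; } )
		.join( '\n' );
}

// Sample input only; replace with the real word list.
console.log( toListWikitext( [ 'yaku', ' wasi ', '' ] ) );
```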
1. Create 2 new properties on Lingualibre
- Property `Lexicographic external base` : `qichwa.wikibase.cloud`
- Property `Lexicographic external base ID` : `L2` (for https://qichwa.wikibase.cloud/wiki/Lexeme:L2 )
2. Finish fixing externaltool.js so that it:
2.1 pulls from qichwa:
- ?id = L2
- ?idLabel = yaku
2.2 uploads .wav file to commons
2.3 records on lingualibre item :
- Wikimedia Commons recording pointer url ( https://commons.wikimedia.org/wiki/File:*.wav ) See example: https://lingualibre.org/wiki/Q191178#P3
- Lexicographic external base : qichwa.wikibase.cloud
- Lexicographic external base ID : L2
3. A Lingualibre <=> Qichwa link now exists:
3.1 On Qichwa.wikibase.cloud, create property 'recording url pointer' on the model of https://lingualibre.org/wiki/Property:P3
3.2 Use a bot to read Lingualibre Qichwa items, then read
- ?id = P? `Lexicographic external base ID` value
- ?url P3 `recording url pointer`
3.3 Use bot to update qichwa.wikibase.cloud/wiki/Lexeme:{id}#{url}
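The bot in steps 3.2 and 3.3 mostly has to pair each external ID with its recording URL. A minimal sketch of that pairing step, assuming the bot reads Lingualibre through a SPARQL query whose JSON results expose `id` and `url` variables. The function name `toUpdateJobs`, the result shape, and the sample file name are assumptions, and the actual write to qichwa.wikibase.cloud is left out:

```javascript
// Hypothetical helper: converts SPARQL JSON bindings (with ?id and ?url)
// into a list of update jobs for the Qichwa lexeme pages.
function toUpdateJobs( bindings ) {
	return bindings
		// Skip rows that lack either the external ID or the recording URL
		.filter( function ( b ) {
			return b.id !== undefined && b.url !== undefined;
		} )
		.map( function ( b ) {
			return {
				lexemeId: b.id.value,
				recordingUrl: b.url.value,
				// Page pattern taken from the Lexeme:L2 example above
				target: 'https://qichwa.wikibase.cloud/wiki/Lexeme:' + b.id.value
			};
		} );
}

// Sample bindings only; the file name is a placeholder.
var jobs = toUpdateJobs( [
	{ id: { value: 'L2' }, url: { value: 'https://commons.wikimedia.org/wiki/File:Example.wav' } },
	{ id: { value: 'L3' } }
] );
console.log( jobs );
```

Keeping this step as a pure function makes it easy to review the planned edits before any bot writes to the wikibase.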
but you have one year to do so.
Qichwa_wikibase_identifier. Can refer to https://wikidata.org/wiki/Wikidata:Property_proposal/Lingua_Libre_IDQichwa_wikibase_identifier
Solution | Pro | Con |
---|---|---|
👉🏼 Hand-made Lingualibre lists with no wikibase link. | Fastest. | Weakest link. |
👉🏼 externaltool.js can be made compatible to pull ?id and ?idLabel from qichwa to generate the list and linked Lingualibre items. Delay: 2~4 weeks to get into prod. | Good link. I'm available to do so if needed. | Temporary solution, will need a bot to finish it up. |
👉🏼 Wikidata property creation for Qichwa_wikibase_id. | Good link. | Slowest. |
This nearly solves the issue. The `indexOfId` switch still needs to be clarified.
'use strict';
var PETSCAN_URL = 'petscan.wmflabs.org/',
WDQS_URL = 'query.wikidata.org/',
QICHWA_URL = 'qichwa.wikibase.cloud/query/sparql',
rw = mw.recordWizard;
var ExternalTools = function ( config ) {
rw.store.generator.generic.call( this, config );
};
OO.inheritClass( ExternalTools, rw.store.generator.generic );
// This line defines an internal name for the generator
ExternalTools.static.name = 'externaltools';
// And this one defines the name for the generator which will be displayed in the UI
ExternalTools.static.title = 'ExternalTools';
ExternalTools.prototype.initialize = function () {
// The two text fields
this.urlField = new OO.ui.TextInputWidget();
this.limitField = new OO.ui.NumberInputWidget( { min: 1, max: 2000, value: 500, step: 10, pageStep: 100, isInteger: true } );
// The custom layout
this.layout = new OO.ui.Widget( {
classes: [ 'mwe-recwiz-externaltools' ],
content: [
new OO.ui.FieldLayout( this.urlField, {
align: 'top',
label: 'ExternalTools URL (PetScan, Wikidata query service):'
} ),
new OO.ui.FieldLayout(
this.limitField, {
align: 'top',
label: mw.message( 'mwe-recwiz-nearby-limit' ).text()
}
)
]
} );
// To be displayed, all the fields/widgets/... should be appended to "this.content.$element"
this.content.$element.append( this.layout.$element );
// Do not remove this line, it will initialize the popup itself
rw.store.generator.generic.prototype.initialize.call( this );
};
ExternalTools.prototype.fetch = function () {
// Get the values of our text fields
var generator = this,
url = this.urlField.getValue();
this.limit = parseInt( this.limitField.getValue(), 10 );
/*
* TODO:
* - list of turnkey urls
*/
// Initialize a new promise
this.deferred = $.Deferred();
// Initialize our word list
this.list = [];
// Check if the given URL refers to an allowed external tool
var isPetscan = url.lastIndexOf( 'http://' + PETSCAN_URL, 0 ) === 0 || url.lastIndexOf( 'https://' + PETSCAN_URL, 0 ) === 0,
isWDQS = url.lastIndexOf( 'https://' + WDQS_URL, 0 ) === 0,
isQICHWA = url.lastIndexOf( 'https://' + QICHWA_URL, 0 ) === 0 ;
if ( isPetscan ) {
// We will do an AJAX request to petscan's API
$.get( url + '&output_compatability=quick-intersection&format=json&doit=' ).then( this.PetScan.bind( this ), function ( error ) { generator.deferred.reject( new OO.ui.Error( error ) ); } );
}
else if ( isWDQS ) {
// We will do an AJAX request to Wikidata Query Service
url = url.replace('https://query.wikidata.org/#', 'https://query.wikidata.org/sparql?query=') + '&format=json';
$.get( url ).then( this.WikidataQueryService.bind( this ), function ( error ) { generator.deferred.reject( new OO.ui.Error( error ) ); } );
}
else if ( isQICHWA ) {
// isWDQS is already handled by the branch above, so only Qichwa URLs reach this point
// We will do an AJAX request to the provided Query Service
url = url.replace(/(https:\/\/\w+\.\w+\.\w+)\/#/, '$1/sparql?query=') + '&format=json';
$.get( url ).then( this.WikidataQueryService.bind( this ), function ( error ) { generator.deferred.reject( new OO.ui.Error( error ) ); } );
}
else {
this.deferred.reject( new OO.ui.Error( 'This is not an allowed URL... It should link to PetScan or Wikidata Query.' ) );
return this.deferred.promise();
}
this.lockUI();
// At this point we're not done yet, make the dialog closing process
// to wait the promise to be resolved or rejected
this.deferred.then( this.unlockUI.bind( this ), this.unlockUI.bind( this ) );
return this.deferred.promise();
};
ExternalTools.prototype.PetScan = function ( data ) {
var i, page, ns, element, property,
prefix = '',
project = mw.util.getParamValue( 'project', data.query ),
language = mw.util.getParamValue( 'language', data.query );
// Check whether the response looks fine or not
if ( data.status !== 'OK' ) {
this.deferred.reject( new OO.ui.Error( 'Petscan outputs something weird with this URL, check it and come back afterwards.' ) );
// Stop here: data.pages cannot be trusted when the status is not OK
return;
}
// For projects that have a custom property, select it
switch ( project ) {
case 'wikipedia':
property = 'P19';
prefix = language + ':';
break;
case 'wiktionary':
property = 'P20';
prefix = language + ':';
break;
}
// Parse the complete response (or at least until the limit is reached)
for ( i = 0; i < data.pages.length && i < this.limit; i++ ) {
page = data.pages[ i ];
element = { text: page.page_title.replace( /_/g, ' ' ) };
if ( property !== undefined ) {
ns = ( page.page_namespace !== 0 ? data.namespaces[ page.page_namespace ] : '' );
element[ property ] = prefix + ns + page.page_title;
}
this.list.push( element );
}
this.deferred.resolve();
};
ExternalTools.prototype.WikidataQueryService = function ( data ) {
var i, item, id, label, property, element, indexOfId;
// Check whether the response looks fine or not
if ( data.results === undefined ) {
this.deferred.reject( new OO.ui.Error( 'SPARQL Query Service outputs something weird with this URL, check it and come back afterwards.' ) );
return;
}
if ( data.results.bindings.length === 0 ) {
this.deferred.reject( new OO.ui.Error( 'No results in the request.' ) );
return;
}
if ( data.results.bindings[ 0 ].id === undefined || data.results.bindings[ 0 ].label === undefined ) {
this.deferred.reject( new OO.ui.Error( 'Result must contain both "id" and "label" fields.' ) );
return;
}
for ( i = 0; i < data.results.bindings.length; i++ ) {
item = data.results.bindings[ i ];
// Position at which the entity ID starts inside the entity URI
indexOfId = 31;
/*
On Wikidata indexOfId = 31, the length of the prefix <http://www.wikidata.org/entity/>.
On Qichwa the prefix <https://qichwa.wikibase.cloud/entity/> is 37 characters long,
so indexOfId would have to be 37 there.
TODO: switch to the exact position of the ID depending on the endpoint:
<http://www.wikidata.org/entity/L2>
<https://qichwa.wikibase.cloud/entity/L2>
*/
id = item.id.value.substring( indexOfId );
label = item.label.value;
switch( id[ 0 ] ) {
case 'L':
property = 'P21';
break;
default:
property = 'P12';
break;
}
element = { "text": label };
element[ property ] = id;
this.list.push( element );
}
this.deferred.resolve();
};
ExternalTools.prototype.lockUI = function () {
this.urlField.setDisabled( true );
this.limitField.setDisabled( true );
};
ExternalTools.prototype.unlockUI = function () {
this.urlField.setDisabled( false );
this.limitField.setDisabled( false );
this.getActions().get( { actions: 'save' } )[ 0 ].setDisabled( false );
};
rw.store.generator.register( 'externaltools', ExternalTools.static.title, 'll-externaltools', new ExternalTools() );
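The `indexOfId` question in the code above could be sidestepped by cutting the entity URI at its last slash instead of at a fixed position. A minimal sketch, assuming every entity URI ends in `/<ID>`; `extractEntityId` and `propertyForId` are hypothetical helper names, and the `P21`/`P12` mapping is copied from the existing switch (whether Qichwa IDs should map to these properties or to new ones is still open in this thread):

```javascript
// Hypothetical helper: returns the part of an entity URI after the last "/",
// so it works regardless of how long the wikibase prefix is.
function extractEntityId( uri ) {
	return uri.substring( uri.lastIndexOf( '/' ) + 1 );
}

// Same mapping as the existing switch: lexeme IDs start with "L".
function propertyForId( id ) {
	return id[ 0 ] === 'L' ? 'P21' : 'P12';
}

console.log( extractEntityId( 'http://www.wikidata.org/entity/L2' ) );
console.log( extractEntityId( 'https://qichwa.wikibase.cloud/entity/L2' ) );
```

With this approach the same parsing code serves both endpoints, and no per-wikibase offset has to be maintained.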
Hi @hugolpz, thanks for your support during this process, I really appreciate it!
I think you clarified all questions regarding which approaches to follow (GitHub). Now, I would like to propose continuing with the Solution 2: medium strategy you proposed (which I understand is temporary):
Roadmap:
Questions aside:
- Is it possible to add users on RecordWizard with age ranges?
- Temp. solution: add their age to the “Name to Display” value: “Ninfa_64” (?)
Action items:
- I am planning to organize workshops on how to record voices for the different variants on Qichwabase, so they can use LinguaLibre directly and pull lexemes/forms from Qichwabase.
Qichwa services
Lingualibre JS do edit
Approach
Edit `ExternalTools.prototype.WikidataQueryService` into a more general function.
Context
Lingualibre properties on Lingualibre items :
Test SPARQL
Test url