own-pt / openWordnet-PT

OpenWordnet-PT: an open access wordnet for Portuguese
http://openwordnet-pt.org

own-pt files #172

Closed fredsonaguiar closed 3 years ago

fredsonaguiar commented 3 years ago

We split own-pt, described in own-pt-morpho and own-en-morpho, into smaller files, now in .ttl format. The new files are available in own-pt-files.

Each generated file models one aspect of own-pt, which may or may not be needed depending on the task:

arademaker commented 3 years ago

It just wasn't clear where the relations between synsets and wordsenses are. You said they would be in relations but also in wordsense-words.

arademaker commented 3 years ago

Please also document the code and the command used to generate the files.

fredsonaguiar commented 3 years ago

"It just wasn't clear where the relations between synsets and wordsenses are. You said they would be in relations but also in wordsense-words."

Indeed, the sentence is ambiguous. Read the *-relations file as containing the relations that come from PWN, not the structural/constructive relations that describe the basic topology of OWN-PT.

Relations such as wn30:containsWordSense and wn30:word are kept alongside the definitions of the WordSense and Word nodes, just as the relations nomlex:verb and nomlex:noun are kept alongside the definitions of nomlex:Nominalization.
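One way to check which predicates actually ended up in a given file is to tally them. The sketch below is only an illustration: it assumes the file is available as N-Triples (one statement per line), and the sample IRIs under example.org are hypothetical.

```python
import re
from collections import Counter

# In N-Triples the predicate is the second term and is always an IRI in <...>.
PREDICATE = re.compile(r"^\s*\S+\s+<([^>]*)>\s")

def count_predicates(lines):
    """Count how many statements use each predicate IRI."""
    counts = Counter()
    for line in lines:
        match = PREDICATE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Hypothetical statements, for illustration only.
sample = [
    '<https://example.org/synset-1> <https://w3id.org/own-pt/wn30/schema/containsWordSense> <https://example.org/wordsense-1-1> .',
    '<https://example.org/synset-1> <https://w3id.org/own-pt/wn30/schema/containsWordSense> <https://example.org/wordsense-1-2> .',
    '<https://example.org/wordsense-1-1> <https://w3id.org/own-pt/wn30/schema/word> <https://example.org/word-casa> .',
]
counts = count_predicates(sample)
for predicate, n in counts.most_common():
    print(n, predicate)
```

Because an open file is an iterable of lines, the same function can be applied to a real file by passing the file object instead of the sample list.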

fredsonaguiar commented 3 years ago

To generate the files, we first need to project the PWN relations, and only then do we split them. Note that we project only the relations between synsets, since there is no mapping between senses.
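The idea of projecting synset relations can be sketched as follows. This is not the code of pwn_rel_projection.py; it is a minimal illustration that assumes each Portuguese synset is linked to an English synset by a sameAs statement, and that a relation is copied only when both of its endpoints have a Portuguese counterpart. The function name and all identifiers in the example data are hypothetical.

```python
from collections import defaultdict

def project_relations(pwn_relations, same_as):
    """Copy relations between English synsets onto the linked Portuguese synsets.

    pwn_relations: iterable of (en_subject, predicate, en_object) triples.
    same_as: iterable of (pt_synset, en_synset) pairs.
    """
    # Index the Portuguese synsets by the English synset they are linked to.
    pt_for_en = defaultdict(list)
    for pt, en in same_as:
        pt_for_en[en].append(pt)

    projected = []
    for subj, pred, obj in pwn_relations:
        # Relations with an unmapped endpoint produce nothing.
        for pt_subj in pt_for_en.get(subj, []):
            for pt_obj in pt_for_en.get(obj, []):
                projected.append((pt_subj, pred, pt_obj))
    return projected

# Hypothetical identifiers, for illustration only.
links = [("pt:synset-A", "en:synset-A"), ("pt:synset-B", "en:synset-B")]
relations = [
    ("en:synset-A", "hypernymOf", "en:synset-B"),
    ("en:synset-A", "antonymOf", "en:synset-C"),  # en:synset-C has no link
]
result = project_relations(relations, links)
print(result)
```

Only the hypernymOf relation survives in this example, because the second relation points to a synset without a Portuguese counterpart.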

The relations are projected by this script. The split is based on predicates, taking advantage of the .nt format. This way we choose precisely which statements go into each output file:

Projection:

python3 pyownpt/cli/pwn_rel_projection.py own-pt-morpho.nt own-en-morpho.nt -o own-pt-projected.nt -v

We split own-pt-projected.nt, the output of the previous script:

# Create the output directory if it does not exist yet.
mkdir -p own-pt-split
# Namespace of the nomlex statements, and the regex matching every PWN relation predicate.
NOMLEX="https://w3id.org/own-pt/nomlex"
SCHEMA="https://w3id.org/own-pt/wn30/schema"
RELS="$SCHEMA/(adjectivePertainsTo|adverbPertainsTo|antonymOf|attribute|causes|classifiedByRegion|classifiedByTopic|classifiedByUsage|classifiesByRegion|classifiesByTopic|classifiesByUsage|derivationallyRelated|entails|hasInstance|hypernymOf|hyponymOf|instanceOf|similarTo|substanceHolonymOf|substanceMeronymOf|memberHolonymOf|memberMeronymOf|partHolonymOf|participleOf|partMeronymOf|sameVerbGroupAs|seeAlso)"

# Each command removes the lines already taken by the previous ones, then selects its own.
grep "$NOMLEX" own-pt-projected.nt > own-pt-split/own-pt-morphosemantic-links.nt
grep -v "$NOMLEX" own-pt-projected.nt | grep "sameAs" > own-pt-split/own-pt-same-as.nt
grep -v "$NOMLEX" own-pt-projected.nt | grep -v "sameAs" | grep -E "$RELS" > own-pt-split/own-pt-relations.nt
grep -v "$NOMLEX" own-pt-projected.nt | grep -v "sameAs" | grep -Ev "$RELS" | grep "/word-" > own-pt-split/own-pt-words.nt
grep -v "$NOMLEX" own-pt-projected.nt | grep -v "sameAs" | grep -Ev "$RELS" | grep -v "/word-" | grep "/wordsense-" > own-pt-split/own-pt-wordsenses.nt
grep -v "$NOMLEX" own-pt-projected.nt | grep -v "sameAs" | grep -Ev "$RELS" | grep -v "/word-" | grep -v "/wordsense-" > own-pt-split/own-pt-synsets.nt

We split own-en-morpho:

# Same steps as above, reusing NOMLEX and RELS, but without the sameAs file.
grep "$NOMLEX" own-en-morpho.nt > own-pt-split/own-en-morphosemantic-links.nt
grep -v "$NOMLEX" own-en-morpho.nt | grep -E "$RELS" > own-pt-split/own-en-relations.nt
grep -v "$NOMLEX" own-en-morpho.nt | grep -Ev "$RELS" | grep "/word-" > own-pt-split/own-en-words.nt
grep -v "$NOMLEX" own-en-morpho.nt | grep -Ev "$RELS" | grep -v "/word-" | grep "/wordsense-" > own-pt-split/own-en-wordsenses.nt
grep -v "$NOMLEX" own-en-morpho.nt | grep -Ev "$RELS" | grep -v "/word-" | grep -v "/wordsense-" > own-pt-split/own-en-synsets.nt

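
The grep commands above match their patterns anywhere on a line, so a literal that happens to contain "sameAs", or a predicate whose name merely starts with one of the listed names, would land in the wrong file. A possible variant, sketched here under assumptions, reads the predicate from its position in the N-Triples line and compares it exactly. The function name bucket, the shortened predicate set, the gloss predicate and the example.org IRIs are hypothetical and only serve the illustration; this is not how the files were actually produced.

```python
import re

NOMLEX = "https://w3id.org/own-pt/nomlex"
SCHEMA = "https://w3id.org/own-pt/wn30/schema/"
# Shortened for the example; the commands above list 27 predicate names.
RELATION_NAMES = {"antonymOf", "attribute", "hypernymOf", "hyponymOf"}

# In N-Triples the predicate is the second term and is always an IRI in <...>.
PREDICATE = re.compile(r"^\s*\S+\s+<([^>]*)>\s")

def bucket(line):
    """Return the output group for one N-Triples line."""
    match = PREDICATE.match(line)
    predicate = match.group(1) if match else ""
    if NOMLEX in line:  # same test as the first grep
        return "morphosemantic-links"
    if predicate.endswith("sameAs"):  # predicate only, not the whole line
        return "same-as"
    if predicate.startswith(SCHEMA) and predicate[len(SCHEMA):] in RELATION_NAMES:
        return "relations"  # exact predicate name, no prefix match
    if "/word-" in line:
        return "words"
    if "/wordsense-" in line:
        return "wordsenses"
    return "synsets"

# Hypothetical statements, for illustration only.
lines = [
    '<https://example.org/synset-1> <https://w3id.org/own-pt/wn30/schema/hypernymOf> <https://example.org/synset-2> .',
    '<https://example.org/synset-1> <http://www.w3.org/2002/07/owl#sameAs> <https://example.org/en-synset-1> .',
    '<https://example.org/synset-1> <https://w3id.org/own-pt/wn30/schema/gloss> "mentions sameAs in a literal" .',
    '<https://example.org/wordsense-1-1> <https://w3id.org/own-pt/wn30/schema/word> <https://example.org/word-casa> .',
]
groups = [bucket(line) for line in lines]
print(groups)
```

The third statement shows the difference: grep "sameAs" would send it to the same-as file, while the predicate-based test leaves it with the synset statements.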
arademaker commented 3 years ago

This split could be more robust. But at the very least we need to save these commands in a bash script.

arademaker commented 3 years ago

Also note that we need to explain the step-by-step process better. You mention an own-en-morpho that is no longer in the repository.