MyWebIntelligence / MyDocClient

MIT License

Export function: 'corpus' option #50

Closed: alakel closed this issue 1 year ago

alakel commented 2 years ago

In MyWebIntelligencePython,

add to the function


Export land type = ['pagecsv', 'pagegexf', 'fullpagecsv', 'nodecsv', 'nodegexf', 'mediacsv']

[venv/bin/]$ python mywi.py land export --name=LAND_NAME --type=EXPORT_TYPE --minrel=MINIMUM_RELEVANCE


... a 'corpus' option that generates a series of .txt files in a YAML format compatible with My Doc,

based on this model taken from R:

# Replace double quotes with single quotes so values stay valid inside "..."
sq <- function(x) gsub("\"", "'", x)

for (i in seq_len(nrow(fullpage))) {
  # One file per page: YAML header, then the readable text
  write.table(
    paste(
      "---\nTitle: \"", sq(fullpage$title[i]),
      "\"\nCreator: \"\"\nContributor: \"\"\nCoverage: \"\"\nDate: \"\"\nDescription: \"", sq(fullpage$description[i]),
      "\"\nSubject: \"", sq(fullpage$doctopic_cluster[i]),
      "\"\nType: \"\"\nFormat: \"\"\nIdentifier: \"", sq(fullpage$id[i]),
      "\"\nLanguage: \"\"\nPublisher: \"", sq(fullpage$domain_name[i]),
      "\"\nRelation: \"\"\nRights: \"\"\nSource: \"", sq(fullpage$url[i]),
      "\"\n---\n", sq(fullpage$readable[i]),
      sep = ""
    ),
    paste("ProjectName", fullpage$id[i], ".txt", sep = ""),
    quote = FALSE, row.names = FALSE, col.names = FALSE
  )
  print(i)  # progress indicator
}
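For the Python side, the same layout could be produced roughly as follows. This is only a sketch under assumptions: `render_document`, `write_corpus`, the dict-based page records and the `ProjectName` prefix are illustrative names, not the code that ships in MyWebIntelligencePython. Field names and their order are copied from the R model.

```python
from pathlib import Path

# Header keys in the order used by the R model.
# The second item is the (assumed) page-record key; None means "left empty".
FIELDS = [
    ("Title", "title"),
    ("Creator", None),
    ("Contributor", None),
    ("Coverage", None),
    ("Date", None),
    ("Description", "description"),
    ("Subject", "doctopic_cluster"),
    ("Type", None),
    ("Format", None),
    ("Identifier", "id"),
    ("Language", None),
    ("Publisher", "domain_name"),
    ("Relation", None),
    ("Rights", None),
    ("Source", "url"),
]


def clean(value):
    """Stringify a value and swap double quotes for single quotes.

    Assumption: missing values (None) become an empty string.
    """
    return "" if value is None else str(value).replace('"', "'")


def render_document(page):
    """Return one corpus file: a YAML header followed by the readable text."""
    lines = ["---"]
    for key, source in FIELDS:
        value = clean(page.get(source)) if source else ""
        lines.append(f'{key}: "{value}"')
    lines.append("---")
    return "\n".join(lines) + "\n" + clean(page.get("readable")) + "\n"


def write_corpus(pages, out_dir, prefix="ProjectName"):
    """Write one <prefix><id>.txt file per page and return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for page in pages:
        path = out / f"{prefix}{page['id']}.txt"
        path.write_text(render_document(page), encoding="utf-8")
        paths.append(path)
    return paths
```

Like the R model, this only handles double quotes. Backslashes are escape characters inside double-quoted YAML scalars, so a real implementation may need to handle them as well.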

Then zip everything for download.
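The zipping step could look something like this with the standard `zipfile` module. It is a minimal sketch: `zip_corpus` and the `{filename: text}` input are assumptions for illustration, and the archive is built in memory so the bytes can be handed to whatever serves the download.

```python
import io
import zipfile


def zip_corpus(documents):
    """Pack a {filename: text} mapping into a ZIP archive and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, text in documents.items():
            # Encode explicitly so every file in the archive is UTF-8
            archive.writestr(name, text.encode("utf-8"))
    return buffer.getvalue()
```

The returned bytes can be written to a file opened in `"wb"` mode or sent as an HTTP response body.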

Watch out for nested quotation marks inside the variables, to keep the output YAML-compatible.

neoxium commented 1 year ago

Done in release 1.3 of MyWebIntelligencePython.