Closed: alakel closed this issue 1 year ago
In MyWebIntelligencePython,
add to the function
Export land type = ['pagecsv', 'pagegexf', 'fullpagecsv', 'nodecsv', 'nodegexf', 'mediacsv']
[venv/bin/]$ python mywi.py land export --name=LAND_NAME --type=EXPORT_TYPE --minrel=MINIMUM_RELEVANCE
... a 'corpus' option that generates a series of .txt files with a YAML header, compatible with My Doc,
based on this template taken from R:
for (i in seq_len(nrow(fullpage))) {
  # Double quotes in values become single quotes so the YAML header stays valid.
  write.table(
    paste(
      "---\nTitle: \"", gsub("\"", "'", fullpage$title[i]),
      "\"\nCreator: \"\"\nContributor: \"\"\nCoverage: \"\"\nDate: \"\"\nDescription: \"", gsub("\"", "'", fullpage$description[i]),
      "\"\nSubject: \"", gsub("\"", "'", fullpage$doctopic_cluster[i]),
      "\"\nType: \"\"\nFormat: \"\"\nIdentifier: \"", gsub("\"", "'", fullpage$id[i]),
      "\"\nLanguage: \"\"\nPublisher: \"", gsub("\"", "'", fullpage$domain_name[i]),
      "\"\nRelation: \"\"\nRights: \"\"\nSource: \"", gsub("\"", "'", fullpage$url[i]),
      "\"\n---\n", gsub("\"", "'", fullpage$readable[i]),
      sep = ""
    ),
    # One file per page: "ProjectName" + page id + ".txt"
    paste("ProjectName", fullpage$id[i], ".txt", sep = ""),
    quote = FALSE, row.names = FALSE, col.names = FALSE
  )
  print(i)  # progress indicator
}
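A minimal Python sketch of the same template, for illustration only. It assumes each page is available as a dict keyed like the R data frame columns (`title`, `description`, `doctopic_cluster`, `id`, `domain_name`, `url`, `readable`). The names `yaml_safe` and `render_corpus_document` are hypothetical and are not taken from mywi.py.

```python
def yaml_safe(value):
    """Replace double quotes with single quotes, as the R gsub() calls do."""
    # Assumption: missing values are written as empty strings.
    return str(value if value is not None else "").replace('"', "'")


def render_corpus_document(page):
    """Return one document: YAML header, then the readable text."""
    header = [
        ("Title", page.get("title")),
        ("Creator", ""),
        ("Contributor", ""),
        ("Coverage", ""),
        ("Date", ""),
        ("Description", page.get("description")),
        ("Subject", page.get("doctopic_cluster")),
        ("Type", ""),
        ("Format", ""),
        ("Identifier", page.get("id")),
        ("Language", ""),
        ("Publisher", page.get("domain_name")),
        ("Relation", ""),
        ("Rights", ""),
        ("Source", page.get("url")),
    ]
    lines = ["---"]
    # Every value is wrapped in double quotes, so inner double quotes must go.
    lines += [f'{key}: "{yaml_safe(value)}"' for key, value in header]
    lines.append("---")
    # The R template also replaces quotes in the body text; this mirrors it.
    return "\n".join(lines) + "\n" + yaml_safe(page.get("readable")) + "\n"
```

Calling `render_corpus_document({"id": 7, "title": 'A "quoted" title', "readable": "Body"})` would give a header whose Title value contains single quotes only, followed by the body text.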
Then zip everything for download.
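The zip step could be sketched with Python's standard `zipfile` module. This is an illustration under assumptions: `build_corpus_zip` and its `prefix` parameter are hypothetical names, and the input is assumed to be a dict mapping page ids to already rendered document text.

```python
import io
import zipfile


def build_corpus_zip(documents, prefix="ProjectName"):
    """Pack {page_id: text} into an in-memory zip, one .txt file per page."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for page_id, text in documents.items():
            # File naming follows the R template: prefix + id + ".txt".
            archive.writestr(f"{prefix}{page_id}.txt", text.encode("utf-8"))
    # Bytes of the finished archive, ready to write to disk or serve.
    return buffer.getvalue()
```

The returned bytes can be written to a file opened in binary mode (`"wb"`) or sent as a download response.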
Watch out for nested quotation marks in the variables, which can break YAML compatibility.
Done in release 1.3 of MyWebIntelligencePython.