qdii / tatoeba_parser

A program that parses the tatoeba database

How to use the python interface? #5

Closed da-liii closed 11 years ago

da-liii commented 11 years ago

Environment: Archlinux i686 with the latest autotools

My attempts:

$ autoreconf -i
$ ./configure --enable-python
$ make
$ cd src/.libs
$ python

>>> import libtatoparser
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: dynamic module does not define init function (initlibtatoparser)
>>> import tatoparser
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named tatoparser

I also tried the example code snippet from the Boost.Python documentation, and it works well: https://gitcafe.com/sadhen/lxthw-notes/tree/master/python/boost

qdii commented 11 years ago

Hey, thanks for the bug report. I am working on it.

qdii commented 11 years ago

You are performing the right steps, but please note that you will have to rename libtatoparser.so to tatoparser.so so that the Python interpreter finds it. Here is an example of a script you could use:

import tatoparser
import re

# the regular expression that will be matched against the sentences
regular_exp = re.compile('.*[hH]ey.*')

# a sentence container that is filled up by tatoparser.parse
all_sentences = tatoparser.dataset()

# an object that knows whether two sentences are linked
all_links = tatoparser.linkset()

# an object that knows which tags a sentence holds
all_tags = tatoparser.tagset()

# an object that knows which sentences are in a given list
all_lists = tatoparser.listset()

tatoparser.init(0)
tatoparser.parse( all_sentences, all_links, all_tags, all_lists, "sentences.csv", "links.csv", "tags.csv", "lists.csv" )

for i in range( all_sentences.size() ):
    current_sentence = all_sentences.getByIndex(i)
    if (current_sentence.lang() == 'eng' and regular_exp.match( current_sentence.str() )):
        print current_sentence.id(), current_sentence.lang(), current_sentence.str()

tatoparser.terminate()
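
The rename step mentioned above can also be scripted. This is only a minimal sketch: the helper name make_importable_copy is hypothetical, and it assumes the build output sits in src/.libs as in the original report. It copies the library instead of renaming it, so the build tree stays intact.

```python
import os
import shutil


def make_importable_copy(lib_dir, built_name="libtatoparser.so",
                         module_name="tatoparser.so"):
    """Copy the built library to the file name the interpreter looks for.

    Hypothetical helper: the default file names come from the thread,
    everything else is an assumption for illustration.
    """
    source = os.path.join(lib_dir, built_name)
    target = os.path.join(lib_dir, module_name)
    if not os.path.isfile(source):
        raise IOError("built library not found: " + source)
    # copy2 keeps permissions and timestamps of the original file
    shutil.copy2(source, target)
    return target


# Example call (assumed build directory):
#     make_importable_copy("src/.libs")
```

After the copy, running python from that directory should let "import tatoparser" find the module.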

da-liii commented 11 years ago

I suggest renaming the module from "tatoparser" to "libtatoparser" so that I don't have to rename the library.

qdii commented 11 years ago

Hey sadhen, please feel free to open an issue, even for trivial things like that. I changed tatoparser to libtatoparser in revision a3c44e5e1312b4dacd4d0208d9fe91104def94e8
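
For readers wondering why the file name mattered: CPython derives the name of the init function it looks up from the file name of the extension module, and Boost.Python's BOOST_PYTHON_MODULE(name) exports an init function for exactly that name. The sketch below only illustrates this naming rule. The helper expected_init_symbol is hypothetical, and that the module was declared as tatoparser at the time is inferred from this thread, not checked against the source.

```python
import os


def expected_init_symbol(filename, python_major=2):
    """Return the init function name CPython looks up for an extension file."""
    # The module name is the file name up to the first dot.
    module_name = os.path.basename(filename).split(".")[0]
    # Python 2 looks for init<name>, Python 3 for PyInit_<name>.
    prefix = "init" if python_major == 2 else "PyInit_"
    return prefix + module_name


print(expected_init_symbol("libtatoparser.so"))  # initlibtatoparser
print(expected_init_symbol("tatoparser.so"))     # inittatoparser
```

The first name is the one reported as missing in the ImportError above, which is consistent with a library that only exported the second.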