quadrismegistus / prosodic

Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.
http://quadrismegistus.github.io/prosodic/
GNU General Public License v3.0

python-Levenshtein can't build wheel #20

Closed tnhaider closed 1 month ago

tnhaider commented 3 years ago

Just a heads up: I cannot install prosodic because pip fails to build a wheel for python-Levenshtein.

It looks like multiple people have this problem, but nothing is being done about it: https://github.com/ztane/python-Levenshtein/issues

Is there a chance you could switch to a different implementation of Levenshtein distance?

Cheers.

quadrismegistus commented 3 years ago

Thanks for pointing this out. Will try to fix today. Sorry to leave prosodic a little rusty btw: let me see if I can respond to these issues soon.

quadrismegistus commented 3 years ago

Are you on Mac OSX?

quadrismegistus commented 3 years ago

I'm looking around and I can't find any levenshtein packages that do exactly what prosodic needs besides the one included here. Also reluctant to give up its written-in-C speeds.

What I could do is try to get a conda-based installation of prosodic working. That would allow conda install python-levenshtein which should be able to install a precompiled binary for any OS. Not sure if that's overkill for this problem though

tnhaider commented 3 years ago

Thanks for your quick reply.

I'm on CentOS Linux in a Cluster. ;)

Conda might actually work. I've been able to install levenshtein through that.

And there is no hurry. I implemented my own data-driven prosody detection system for poetry, and I'd like to use prosodic as a baseline. Your litlab dataset has already proved quite useful.

Best, TH

quadrismegistus commented 3 years ago

Ok, let me know if you run into any issues installing prosodic via conda? If it works, I may look into adding prosodic as a conda package so it can list its conda dependencies.

Also, wow, that sounds really interesting. The world of computational prosody is small so it's always nice to hear of new stuff going on! Are you working primarily with German poetry? And are you taking a rule-based or a machine learning approach? Would love to chat more sometime.

tnhaider commented 3 years ago

I am working with both English and German.

And my current best system is a multi-task BiLSTM-CRF with pretrained syllable embeddings.

We also annotated a sizeable amount of gold data for both languages. I can send you the current manuscript via mail if you'd like.

quadrismegistus commented 3 years ago

Very interesting! I'd love to read more. I'm @ rj416@cam.ac.uk

polm commented 3 years ago

I repackaged the Levenshtein library to use wheels so you don't have to compile it. I haven't tested it much, but you can install it with the command below; maybe it'll fix your issue.

pip install levenshtein
tnhaider commented 3 years ago

Hi Ryan,

Thanks for having a look at it.

Unfortunately, the dependency in prosodic is still python-Levenshtein, so the install of prosodic fails regardless.

Is there a way I can change the dependency, or should I just compile the latest build?

Thanks, Tom

tnhaider commented 3 years ago

Alright, I just got your new levenshtein, cloned the repo and started prosodic.py itself. Seems to work so far.

tnhaider commented 3 years ago

Ok, it seems I also get a 'tagged_samples' error:

>> [0.0s] prosodic:en$ /corpus ../corpora/corppoetry_en/en.whitman.txt

    [please type a line of text, or enter one of the following commands:]
        /text   load a text
        /corpus load folder of texts
        /paste  enter multi-line text

        /show   show annotations on input
        /tree   see phonological structure
        /query  query annotations

        /parse  parse metrically
        /meter  set the meter used for parsing
        /eval   evaluate this meter against a hand-tagged sample
        /maxent learn weights for meter using maxent

        /save   save previous output to file (except for /weight and /weight2; see /weightsave)
        /scan   print out the scanned lines
        /report look over the parse outputs
        /stats  save statistics from the parser

        /mute   hide output from screen
        /exit   exit

>> [17.88s] prosodic:en$ /eval
Traceback (most recent call last):
  File "prosodic.py", line 558, in <module>
    path=os.path.join(dir_prosodic,config['folder_tagged_samples'])
KeyError: 'folder_tagged_samples'

The same happens if I execute prosodic.py from the parent directory.

I tried renaming the key in the path setup to 'tagged_samples', but that only changes the name of the key, which is still not found.

557                 elif text.startswith('/eval'):
558                         path=os.path.join(dir_prosodic,config['folder_tagged_samples'])
559                         fn=None

I am not sure what to do with this in the config:

# ############################################
# @DEPRECATED
# # PATHS USED BY PROSODIC
# #
# # If these are relative paths (no leading /),
# # they are defined from the point of view of
# # the root directory of PROSODIC.
# #
# # Folder used as the folder of corpora:
# # [it should contain folders, each of which contains text files]
# folder_corpora='corpora/'
# #
# # Folder to store results within (statistics, etc)
# folder_results='results/'
# #
# # Folder in which tagged samples (hand-parsed lines) are stored:
# folder_tagged_samples = 'tagged_samples/'
# ############################################
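
The KeyError above arises because `folder_tagged_samples` is commented out of the config while `prosodic.py` still indexes it directly. A minimal sketch of a more tolerant lookup follows. The helper name `tagged_samples_path` and the fallback constant are assumptions for illustration, with the default value taken from the commented-out config line; this is not prosodic's actual code.

```python
import os

# Assumed default, copied from the commented-out line in the config.
DEFAULT_TAGGED_SAMPLES = 'tagged_samples/'


def tagged_samples_path(config: dict, dir_prosodic: str) -> str:
    """Build the tagged-samples path without raising KeyError.

    Falls back to DEFAULT_TAGGED_SAMPLES when the key is absent.
    """
    folder = config.get('folder_tagged_samples', DEFAULT_TAGGED_SAMPLES)
    return os.path.join(dir_prosodic, folder)
```

Using `dict.get` with a default keeps `/eval` usable whether or not the deprecated block is uncommented.
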
quadrismegistus commented 3 years ago

hm, let me look into all this. does uncommenting the # folder_tagged_samples = 'tagged_samples/' line in prosodic/config.py change things? are you using pip version? if so, does installing from repo (pip install -U git+https://github.com/quadrismegistus/prosodic) help? Been a while since I dove back into code; will try to do that now

All best, Ryan

tnhaider commented 3 years ago

Will try asap.

If we can get this done by Monday, you will be in the experiments section of my paper, which got accepted to EACL, btw. :)

I basically just need to figure out how I can evaluate my manual annotation against prosodic. I want to determine the accuracy of the meter annotation on syllable and line level.
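One way to compute both accuracies is sketched below. It assumes each line's scansion is a string with one character per syllable (for example `s` for strong and `w` for weak); that representation and the function name `meter_accuracy` are hypothetical, not prosodic's output format. Lines whose predicted syllable count differs from the gold count are scored as wrong at both levels, which is one possible convention among several.

```python
def meter_accuracy(gold_lines, predicted_lines):
    """Return (syllable_accuracy, line_accuracy) for parallel scansion lists."""
    if len(gold_lines) != len(predicted_lines):
        raise ValueError("gold and predicted must have the same number of lines")
    syll_total = syll_correct = line_correct = 0
    for gold, pred in zip(gold_lines, predicted_lines):
        syll_total += len(gold)
        if len(gold) == len(pred):
            matches = sum(g == p for g, p in zip(gold, pred))
            syll_correct += matches
            line_correct += int(matches == len(gold))
        # Lines with a different syllable count contribute no correct syllables.
    syll_acc = syll_correct / syll_total if syll_total else 0.0
    line_acc = line_correct / len(gold_lines) if gold_lines else 0.0
    return syll_acc, line_acc


gold = ["wswswswsws", "swwswswsws"]
pred = ["wswswswsws", "wswswswsws"]
print(meter_accuracy(gold, pred))  # 18 of 20 syllables and 1 of 2 lines match
```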

tnhaider commented 3 years ago

Nope, pip install -U git+https://github.com/quadrismegistus/prosodic doesn't work because it wants to install `python-Levenshtein`, for which I can't build the wheel.

After uncommenting folder_tagged_samples = 'tagged_samples/' in the config, I get either

>> [0.0s] prosodic:en$ /text tagged_samples/tagged-sample-litlab-2016.txt
<file not found>

or

>> [24.65s] prosodic:en$ /eval ../tagged_samples/tagged-sample-litlab-2016.txt
Traceback (most recent call last):
  File "prosodic.py", line 563, in <module>
    for _fn in os.listdir(path):
FileNotFoundError: [Errno 2] No such file or directory: 'tagged_samples/'

Loading text with /text corppoetry_en/en.shakesspeare.txt does work however. Then doing a /parse also works. But then doing /eval gets me an error.

>> [8.67s] prosodic:en$ /eval
Traceback (most recent call last):
  File "prosodic.py", line 563, in <module>
    for _fn in os.listdir(path):
FileNotFoundError: [Errno 2] No such file or directory: 'tagged_samples/'
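
Both tracebacks show `os.listdir` receiving the bare relative path `'tagged_samples/'`, which is resolved against the current working directory. A hedged sketch of a guard follows. The helper `list_tagged_samples` and its `root` parameter are hypothetical, and whether prosodic should anchor the path at its install directory is an assumption.

```python
import os


def list_tagged_samples(folder: str, root: str) -> list:
    """List sample files in `folder`, resolving relative paths against `root`."""
    path = folder if os.path.isabs(folder) else os.path.join(root, folder)
    if not os.path.isdir(path):
        # Report "no samples" instead of crashing the interactive prompt.
        return []
    return sorted(fn for fn in os.listdir(path) if not fn.startswith('.'))
```

A caller could pass `os.path.dirname(os.path.abspath(__file__))` as `root` so the result does not depend on where the script is launched from.
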
tnhaider commented 3 years ago

After importing it into my own script, it looks promising now.

Nice tutorial btw!

quadrismegistus commented 3 years ago

@polm Thanks for your help! So should I just change "python-Levenshtein" in requirements.txt to "levenshtein"?
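As a rough illustration of that change, the sketch below rewrites requirement lines so that `python-Levenshtein` becomes `levenshtein` while keeping any version specifier. The function name `swap_requirement` is hypothetical, and the regular expression is a simplification that does not handle every form pip accepts.

```python
import re


def swap_requirement(lines, old="python-Levenshtein", new="levenshtein"):
    """Replace the package name `old` with `new` in requirement lines."""
    # Match the name only when followed by a specifier, marker, extra,
    # comment, or end of line, so longer package names are left alone.
    pattern = re.compile(
        r'^\s*' + re.escape(old) + r'(?=\s*([<>=!~;\[#]|$))',
        re.IGNORECASE,
    )
    return [pattern.sub(new, line) for line in lines]


print(swap_requirement(["numpy", "python-Levenshtein>=0.12"]))
```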

tnhaider commented 3 years ago

Yes, that should solve the problem.

I am not sure about the config path change though. I might have a look at that later.

polm commented 3 years ago

@quadrismegistus Sure, that should work.

I am not sure about the future of that particular pip package yet. The version that's there now won't go away, but maybe development will resume at the old name later.

quadrismegistus commented 2 years ago

Did we ever figure this out?

polm commented 2 years ago

My wheels are still up if you want to use them and I don't see that changing, but I haven't done any other work on the project and don't have time to work on it going forward.

For the original levenshtein project, the maintainer contacted me about taking over, but by that point I was already busy with other work and had to refuse. Someone else stepped up not long after and volunteered to be maintainer but hasn't gotten a response.

You can follow progress here.

https://github.com/ztane/python-Levenshtein/issues/61