nickduran / align-linguistic-alignment

Python library for extracting quantitative, reproducible metrics of multi-level alignment between speakers in naturalistic language corpora.
MIT License
40 stars 12 forks

python 3 branch failing #41

Closed LudvigOlsen closed 5 years ago

LudvigOlsen commented 5 years ago

Running the CHILDES example with python 3.7 branch. Getting this error when calling the model_store = align.prepare_transcripts( chunk:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-24-a11e0da2ba9f> in <module>()
      9                         # stanford_pos_path=STANFORD_POS_PATH,
     10                         # stanford_language_path=STANFORD_LANGUAGE,
---> 11                         save_concatenated_dataframe=True)

~/anaconda/envs/deeplearning/lib/python3.6/site-packages/align/prepare_transcripts.py in prepare_transcripts(input_files, output_file_directory, training_dictionary, minwords, use_filler_list, filler_regex_and_list, add_stanford_tags, stanford_pos_path, stanford_language_path, input_as_directory, save_concatenated_dataframe)
    492 
    493     # train our spell-checking model
--> 494     nwords = train(re.findall('[a-z]+', (file(training_dictionary).read().lower())))
    495 
    496     # grab the appropriate files

NameError: name 'file' is not defined

The file() function should probably be substituted with the open() function.
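A minimal sketch of that substitution, assuming the training dictionary is a plain-text file readable as UTF-8. The helper name `read_training_words` is hypothetical and not part of ALIGN; it only reproduces the tokenizing step from line 494 of the traceback, and the resulting list would then be handed to the library's own `train` function:

```python
import re


def read_training_words(training_dictionary):
    """Return the lowercase word tokens found in a training dictionary file.

    Hypothetical helper, not part of ALIGN. Python 3 removed the file()
    builtin, so open() is used instead; the context manager also closes
    the file handle, which the original one-liner never did.
    """
    # Assumption: the dictionary file is UTF-8 encoded text.
    with open(training_dictionary, encoding="utf-8") as handle:
        text = handle.read().lower()
    # Same regex as in the traceback: runs of the letters a-z only.
    return re.findall('[a-z]+', text)
```

The smallest possible patch would be to swap `file(` for `open(` on that one line; the `with` block is just the more idiomatic form.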

The print statements in the examples are python 2 as well. Easy to fix though. :)
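For illustration, the print fix looks roughly like this. The message text and the `n_files` variable are made up for the example and are not taken from the ALIGN tutorials:

```python
# Python 2 statement form, which is a SyntaxError on Python 3:
#     print "Processed", n_files, "files"

# Function-call form, which runs on Python 3 (and on Python 2.7 when the
# argument is a single string, as it is here).
n_files = 3  # illustrative value only
message = "Processed {} files".format(n_files)
print(message)
```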

nickduran commented 5 years ago

Ah, thanks for helping troubleshoot! You're completely right. The tutorials in the Python3 branch will not work because they are based on the Python 2.7 package of ALIGN. I know, confusing. But if you look at the actual guts of the new code (prepare_transcripts) in the "align" folder, you will see that all of the appropriate Python2 to 3 changes have been made. It's this new Python3-friendly code we need to push to PyPI. Our developer (Alex) has been super busy but it should be soon.

I also need to update the README on the Python3 branch, as changes have been made to the Stanford POS tagger that might be throwing errors. This should all be done soon. Once it is, I will make the Python3 branch the master branch and send out a message that all is ready to go. Cheers, Nick


LudvigOlsen commented 5 years ago

I was probably a bit too quick to jump on the py3 branch! :-) We're using it in class today (after Riccardo's inspiring lecture), so we had 30 people trying to use it. Using the py2 version now instead! Best, Ludvig

nickduran commented 5 years ago

That is really great to hear! I will try my best to get you the most up to date version for your students soon.


LudvigOlsen commented 5 years ago

More like @fusaroli's students, as I'm just one in the flock ;) It might have been a one-time thing, but perhaps some might like to use ALIGN for their exam projects at the end of the year! Py 2.7 version is easy to use with anaconda though! I'm sure Riccardo will be throwing it at students for years to come :)