nickduran / align-linguistic-alignment

Python library for extracting quantitative, reproducible metrics of multi-level alignment between speakers in naturalistic language corpora.
MIT License

Potential Python 3 function incompatibility for `calculate_alignment.py` #47

Closed: a-paxton closed this issue 4 years ago

a-paxton commented 4 years ago

One of the methods used by `align.calculate_alignment`, `dict.iteritems()`, was removed in Python 3. After replacing it with its Python 3 equivalent, `dict.items()`, we get the following error, which is raised further along in `BuildSemanticModel`:

[turn_real,convo_real] = align.calculate_alignment(
                            input_files=INPUT_FILES,
                            maxngram=MAXNGRAM,   
                            use_pretrained_vectors=USE_PRETRAINED_VECTORS,
                            pretrained_input_file=PRETRAINED_INPUT_FILE,
                            semantic_model_input_file=SEMANTIC_MODEL_INPUT_FILE,
                            output_file_directory=ANALYSIS_READY,
                            add_stanford_tags=ADD_STANFORD_TAGS,
                            ignore_duplicates=IGNORE_DUPLICATES,
                            high_sd_cutoff=HIGH_SD_CUTOFF,
                            low_n_cutoff=LOW_N_CUTOFF)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-33-a98b9a179c5a> in <module>
      9                             ignore_duplicates=IGNORE_DUPLICATES,
     10                             high_sd_cutoff=HIGH_SD_CUTOFF,
---> 11                             low_n_cutoff=LOW_N_CUTOFF)

~/GitHub/align-linguistic-alignment/align/calculate_alignment.py in calculate_alignment(input_files, output_file_directory, semantic_model_input_file, pretrained_input_file, high_sd_cutoff, low_n_cutoff, delay, maxngram, use_pretrained_vectors, ignore_duplicates, add_stanford_tags, input_as_directory)
    821                                                        use_pretrained_vectors=use_pretrained_vectors,
    822                                                        high_sd_cutoff=high_sd_cutoff,
--> 823                                                        low_n_cutoff=low_n_cutoff)
    824 
    825     # create containers for alignment values

~/GitHub/align-linguistic-alignment/align/calculate_alignment.py in BuildSemanticModel(semantic_model_input_file, pretrained_input_file, use_pretrained_vectors, high_sd_cutoff, low_n_cutoff)
    143         contentWords = [word for word in frequency.keys()]
    144     else:
--> 145         getOut = np.mean(frequency.values())+(np.std(frequency.values())*(high_sd_cutoff))
    146         contentWords = {word: freq for word, freq in frequency.items() if freq < getOut}.keys()
    147 

/anaconda3/envs/align_testing/lib/python3.7/site-packages/numpy/core/fromnumeric.py in mean(a, axis, dtype, out, keepdims)
   3116 
   3117     return _methods._mean(a, axis=axis, dtype=dtype,
-> 3118                           out=out, **kwargs)
   3119 
   3120 

/anaconda3/envs/align_testing/lib/python3.7/site-packages/numpy/core/_methods.py in _mean(a, axis, dtype, out, keepdims)
     85             ret = ret.dtype.type(ret / rcount)
     86     else:
---> 87         ret = ret / rcount
     88 
     89     return ret

TypeError: unsupported operand type(s) for /: 'dict_values' and 'int'
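
The traceback points at line 145 of `calculate_alignment.py`: in Python 3, `frequency.values()` returns a `dict_values` view rather than a list, and NumPy cannot average it directly. A minimal sketch of one possible fix is below, assuming that converting the view to a list before calling `np.mean` and `np.std` is acceptable here. The `frequency` contents and the `high_sd_cutoff` value are made up for illustration, and this is not necessarily the change that was made in the repository.

```python
import numpy as np

# Hypothetical stand-in for the `frequency` dict built inside BuildSemanticModel
frequency = {"the": 10, "cat": 2, "sat": 3}
high_sd_cutoff = 1  # illustrative value only

# Python 3: dict.values() is a view object, so materialize it as a list
# before passing it to NumPy
freq_values = list(frequency.values())
getOut = np.mean(freq_values) + (np.std(freq_values) * high_sd_cutoff)

# Keep only the words whose frequency falls below the cutoff
contentWords = [word for word, freq in frequency.items() if freq < getOut]
```

The same `list(...)` wrapping would probably be needed anywhere else the module passes `dict.keys()`, `dict.values()`, or `dict.items()` to code that expects a list.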