Open TushaarGVS opened 1 year ago
Thanks for the contribution! There are a few changes needed to get everything working with our github infrastructure:
- The KenLM code introduces a new dependency on (presumably) the kenlm python package. This dependency needs to be added to setup.py so that it gets automatically installed alongside convokit; right now the automated tests cannot run since this dependency is missing.
- Please automatically format the new code with the Black formatter to maintain consistency with the style guidelines. Instructions can be found in the contribution guidelines https://github.com/CornellNLP/ConvoKit/blob/master/CONTRIBUTING.md#contributing-code
In addition to the above, I have a few further minor suggestions related to style and consistency:
- In the abstract class specification for LanguageModel, the "main" function is called cross_entropy. But this is misleading since, as you note, the whole point of abstracting the language model is to allow the use of metrics other than cross entropy. This leads to the extremely confusing situation where the KenLM subclass implements cross_entropy, even though the implementation is (presumably) not actually computing cross entropy. We should therefore use a more generic name such as "score".
- I'm not a big fan of the name ConvoKitLanguageModel since it's a bit too nonspecific. In general, we should try to name the classes such that they are self-descriptive; i.e., a user could make an educated guess at what the class does by looking at the name. Since the purpose of the class is to implement the basic cross entropy score that used to be built in to Surprise, I might suggest renaming the class as CrossEntropyLanguageModel (after having resolved the other issue relating to the use of the term "cross entropy" as mentioned above).
- This pull request adds two new files, file_utils.py and utils.py; I wonder if we should just merge their contents into the top-level util.py that already exists in ConvoKit. This would reduce redundancy, and having taken a look at the new files I can easily imagine their functions being useful in places outside of Surprise.
- I've noticed that the naming of private methods in the new code seems inconsistent: some use double underscores and others use a single underscore. We should standardize on a single naming convention; single underscore is used elsewhere in the code, so we should stick with that.
Thanks again for the contribution, hopefully these asks aren't too much. Let me know if anything is unclear!
Thanks Jonathan, very good points. One note: I would like to not force KenLM as a requirement for convokit. I think of LM modules like forecaster modules; the same way we don't force tensorflow for convokit, we should not force KenLM. This should be documented properly (as I believe we do for the CRAFT forecaster module).
Thanks for the extensive review, Jonathan. I will address the comments and post my questions before I push the final version of my changes for a PR.
In the abstract class specification for LanguageModel, the "main" function is called cross_entropy. But this is misleading since, as you note, the whole point of abstracting the language model is to allow the use of metrics other than cross entropy. This leads to the extremely confusing situation where the KenLM subclass implements cross_entropy, even though the implementation is (presumably) not actually computing cross entropy. We should therefore use a more generic name such as "score"
Jonathan, the "main" function is called evaluate(..., eval_type='') and takes eval_type as one of its arguments; the user specifies the name of their evaluation function there, and that function is then called via eval_fn = getattr(self, eval_type) inside evaluate(...). The function cross_entropy is a dummy function shown in the LanguageModel class, and it needn't be implemented by the user if it isn't used (hence only a RuntimeError, which is raised only upon usage).
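A minimal sketch of this dispatch pattern (the class and method names follow the discussion above, but the toy subclass and its return value are purely illustrative):

```python
class LanguageModel:
    def evaluate(self, target, context, eval_type="cross_entropy", **kwargs):
        # Look up the evaluation function by name, so subclasses can expose
        # any metric (cross_entropy, perplexity, ...) without changing evaluate().
        eval_fn = getattr(self, eval_type)
        return eval_fn(target, context, **kwargs)

    def cross_entropy(self, target, context, **kwargs):
        # Dummy implementation: raises only if this metric is actually used.
        raise RuntimeError("cross_entropy is not implemented by this model.")

class ConstantModel(LanguageModel):
    # Toy subclass (for illustration only) that implements one metric.
    def cross_entropy(self, target, context, **kwargs):
        return 1.0
```

With this shape, evaluate(..., eval_type="perplexity") would transparently dispatch to a perplexity method if a subclass defines one.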
Also, KenLM returns a log-likelihood score, which is essentially the cross-entropy that we compute; hence, both our model and KenLM compute cross-entropy.
This pull request adds two new files file_utils.py and utils.py; I wonder if we should just merge their contents into the top-level util.py that already exists in ConvoKit. This would reduce redundancy, and having taken a look at the new files I can easily imagine their functions being useful in places outside of Surprise
This is a great idea; I will merge the new utility functions into the top-level util.py file.
I'm not a big fan of the name ConvoKitLanguageModel since it's a bit too nonspecific. In general, we should try to name the classes such that they are self-descriptive; i.e., a user could make an educated guess at what the class does by looking at the name. Since the purpose of the class is to implement the basic cross entropy score that used to be built in to Surprise, I might suggest renaming the class as CrossEntropyLanguageModel (after having resolved the other issue relating to the use of the term "cross entropy" as mentioned above
It was initially called CrossEntropy, but based on my answer above, it seems too restrictive to call it CrossEntropyLanguageModel. I think we should come up with a better name.
I've noticed that the naming of private methods in the new code seems inconsistent, some use double underscores and others use a single underscore. We should standardize on a single naming convention; single underscore is used elsewhere in the code so we should stick with that
I have used _ to indicate semi-private variables, i.e., those that aren't supposed to be accessed outside the class, and __ for truly private variables, i.e., those that can't be accessed outside the class even if you wanted to (at least not without _class_name__variable_name). I want to leave that functionality as is, because I think it is good programming practice, but if this inconsistency is a big concern, let me know and I will change it accordingly.
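For reference, a small self-contained demonstration of the difference between the two conventions (the class here is purely illustrative):

```python
# Single leading underscore: a convention signalling "internal use only";
# nothing prevents outside access. Double leading underscore: Python
# name-mangles the attribute to _ClassName__name, so from outside the
# class it is only reachable via the mangled name.
class Model:
    def __init__(self):
        self._semi_private = 1   # accessible, but marked as internal
        self.__private = 2       # stored as _Model__private

m = Model()
print(m._semi_private)       # prints 1
print(m._Model__private)     # prints 2 (via the mangled name)
```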
Thanks Jonathan, very good points. One note: I would like to not force KenLM as a requirement for convokit. I think of LM modules like forecaster modules; the same way we don't force tensorflow for convokit, we should not force KenLM. This should be documented properly (as I believe we do for the CRAFT forecaster module).
I have addressed this, Cristian. The code only installs kenlm when convokit is installed with pip install convokit[kenlm].
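One common way to handle such an optional extra (a sketch of the general pattern, not necessarily the exact code in this PR) is to guard the import and fail with a helpful message only when the optional feature is actually used:

```python
# kenlm is an optional extra, declared in setup.py under extras_require
# and installed only via `pip install convokit[kenlm]`.
try:
    import kenlm
    KENLM_AVAILABLE = True
except ImportError:
    KENLM_AVAILABLE = False

def require_kenlm():
    # Called at the point of use, so importing convokit itself never fails
    # just because the optional extra is absent.
    if not KENLM_AVAILABLE:
        raise ImportError(
            "kenlm is an optional dependency; install it with "
            "`pip install convokit[kenlm]`."
        )
```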
I have finished documenting all the changes; all the classes and functions now have extensive documentation. Also, following Jonathan's comments, I have moved the utility functions to the utils file. Following your comment, I have ensured that kenlm is not a direct dependency of convokit; instead, it can be used via pip install convokit[kenlm].
Additionally, I have implemented a perplexity() function for both language model classes (ConvoKitLanguageModel and Kenlm), in addition to cross_entropy(). I have also run both demos with the latest changes to ensure that I have not introduced any breaking changes.
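The two metrics are closely related; as a sketch (assuming cross-entropy is measured in nats, which the actual implementation may or may not do):

```python
import math

def perplexity_from_cross_entropy(cross_entropy: float) -> float:
    # Perplexity is the exponential of cross-entropy: exp(H) for H in nats
    # (it would be 2 ** H for H measured in bits).
    return math.exp(cross_entropy)

print(perplexity_from_cross_entropy(0.0))           # 1.0: a perfect model
print(perplexity_from_cross_entropy(math.log(10)))  # ~10.0: uniform over 10 tokens
```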
Description
This change decouples the language model computations from the Surprise transformer, so as to enable more flexibility in choosing among several other language models, including KenLM, and to speed up these computations. Furthermore, this change introduces a LanguageModel class and implements two LanguageModel subclasses: the ConvoKitLanguageModel class, which houses the existing cross-entropy function, and the Kenlm class, which uses the KenLM model. The computation of surprise has been thread-parallelized, and the language model evaluation functions themselves have also been thread-parallelized for runtime efficiency (although this efficiency gain is quite minimal, since we use threads and Python's GIL disallows extensive thread parallelism; any observed speed-up can be attributed to file I/O and NumPy operations, and is to be explored further).
The demos/surprise_demo.ipynb and demos/tennis_demo.ipynb demo files have been updated to reflect the changes made. To ensure as much backward compatibility as possible, only the smooth argument has been removed from the Surprise class signature; the fit and transform methods still run with the same arguments (use kwargs to pass arguments pertaining to the added functionality). Use LanguageModel.config to view the config of individual LanguageModel classes, and LanguageModel.type to view the type of the language model.
Other changes include code modularization, utils additions, and other code formatting. TQDM imports have been modified to enable rendering when using notebooks, and additional (disappearing) TQDM bars have been included to track the progress of surprise computations.
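The thread-parallelization described above can be sketched roughly as follows (a generic pattern, not the PR's exact code; as noted, the GIL limits the benefit to GIL-releasing work such as file I/O, NumPy, or C-extension calls):

```python
from concurrent.futures import ThreadPoolExecutor

def score_all(score_fn, targets, contexts, max_workers=4):
    # Evaluate score_fn over corresponding (target, context) pairs in a
    # thread pool, preserving input order in the returned list.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(score_fn, targets, contexts))

# Toy usage with a stand-in scoring function:
scores = score_all(lambda tgt, ctx: len(tgt) + len(ctx),
                   [["a"], ["x", "y"]], [["b", "c"], []])
print(scores)  # [3, 2]
```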
Motivation and context
This change was made to enable flexibility in language modeling. Currently, we use cross-entropy as a measure of how surprising the conversations are; however, cross-entropy needn't be the only choice: one may employ perplexity, or KenLM log-probability scores, instead. In this regard, decoupling the surprise (language model) computations is important; this change attempts to achieve that.
How has this been tested?
The code has been tested by comparing this implementation to an earlier implementation; with the same preprocessing, the same results were observed on the tennis dataset (working at the utterance level). Furthermore, the cross-entropy refactoring has been tested using the old ConvoKit surprise demo, and (with some randomness in choosing the context and target) the outputs are about the same, i.e., almost the same ordering of the most and least surprising involvements. In the tennis demo, with the use of cross-entropy, the values remained the same.
Usage
Create a new LanguageModel class object, then supply the created LanguageModel when transforming; the updated demo notebooks walk through both steps.
Pending
Documentation and final testing (estimated: December 15, 2022).