allenai / deep_qa

A deep NLP library, based on Keras / tf, focused on question answering (but useful for other NLP too)
Apache License 2.0
404 stars 133 forks source link

Rename max_length dictionary members #238

Closed nelson-liu closed 7 years ago

nelson-liu commented 7 years ago

(Wanted to do this before it fell off my radar). This PR renames the canonical keys we use in the max_length dictionary (used to set the max lengths for padding) to match the class variables in text_trainer.

Fairly trivial PR, i just ran:

grep -rl "word_sequence_length" . | xargs sed -i 's/word_sequence_length/max_sentence_length/g'
grep -rl "word_character_length" . | xargs sed -i 's/word_character_length/max_word_length/g'

with a git commit after each.

renaming _get_word_sequence_lengths to _get_max_sentence_lengths was unintended, but I kind of like the new name --- thoughts? it should actually probably get a new name altogether since you can get both the max_sentence_length and the max_word_length from it...

matt-gardner commented 7 years ago

I think I would actually prefer to standardize the other way, removing "max" entirely. For example, the keys that you used in the AS Reader are formatted like num_*: https://github.com/allenai/deep_qa/blob/b000dc22924a15d0817062e6acc682851b718409/deep_qa/models/reading_comprehension/attention_sum_reader.py#L121-L124.