delip / PyTorchNLPBook

Code and data accompanying Natural Language Processing with PyTorch published by O'Reilly Media https://amzn.to/3JUgR2L
Apache License 2.0

Removes loops in creating data subset #3

Closed: NikhilPr95 closed this 5 years ago

NikhilPr95 commented 5 years ago

Uses pandas groupby and filtering to build the subset dataframe, replacing the manual row-by-row construction with loops.

Also reduces wall time from over 1 min to about 120 ms.
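
For readers who want to see the idea concretely, here is a minimal sketch of the two approaches side by side. It is not the actual diff from this change: the toy dataframe, the `rating` and `review` column names, the `proportion` value, and both function names are assumptions made for illustration. It assumes the goal is to keep the first fraction of rows within each rating group.

```python
import collections

import pandas as pd

# Hypothetical toy data; the column names and values are illustrative only.
reviews = pd.DataFrame({
    "rating": ["negative", "positive"] * 4,
    "review": [f"review {i}" for i in range(8)],
})
proportion = 0.5  # assumed fraction of each rating group to keep


def subset_with_loops(df, proportion):
    """Verbose version: bucket rows by rating, then slice each bucket."""
    by_rating = collections.defaultdict(list)
    for _, row in df.iterrows():
        by_rating[row["rating"]].append(row.to_dict())

    rows = []
    for _, items in sorted(by_rating.items()):
        n_subset = int(proportion * len(items))
        rows.extend(items[:n_subset])
    return pd.DataFrame(rows)


def subset_with_groupby(df, proportion):
    """Vectorized version: compare each row's position in its group to a cutoff."""
    groups = df.groupby("rating")
    position = groups.cumcount()                     # 0-based position within the group
    group_size = groups["rating"].transform("size")  # group size, repeated on every row
    keep = position < (proportion * group_size).astype(int)
    # A stable sort keeps the original order inside each rating group,
    # matching the ordering produced by the loop version.
    kept = df[keep].sort_values("rating", kind="mergesort")
    return kept.reset_index(drop=True)


loop_subset = subset_with_loops(reviews, proportion)
groupby_subset = subset_with_groupby(reviews, proportion)
```

On this toy input both functions select the same rows in the same order. The speed difference reported above would only show up on a dataset large enough for `iterrows` to dominate the run time.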


delip commented 5 years ago

Hi @NikhilPr95, nice observation! @braingineer and I intentionally kept the code verbose in many places for the sake of clarity. Here's an excerpt from the preface of the book:

A note regarding the style of the book. We have intentionally avoided mathematics in most places, not because deep learning math is particularly difficult (it is not), but because it is a distraction in many situations from the main goal of this book—to empower the beginner learner. Likewise, in many cases, both in code and text, we have favored exposition over succinctness. Advanced readers and experienced programmers will likely see ways to tighten up the code and so on, but our choice was to be as explicit as possible so as to reach the broadest of the audience that we want to reach.

Thank you for this input.