Open vinaybagade opened 3 years ago
cc @VibhuJawa
View / edit / reply to this conversation on ReviewNB
VibhuJawa commented on 2021-09-01T23:16:11Z ----------------------------------------------------------------
Line #1. !pip install hvplot
It might be worth commenting this out to keep the notebook output clean during a run.
It is also worth adding s3fs.
Line #3. !!pip install s3fs
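One possible way to keep the install lines out of a normal run is to check which packages are already importable and only print a hint for the missing ones. This is a rough sketch, not something from the notebook: the helper name `missing_packages` is hypothetical, and it only handles top-level package names.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Packages mentioned in this review comment.
missing = missing_packages(["hvplot", "s3fs"])
if missing:
    # Only print a hint; the actual `!pip install ...` lines can stay commented out.
    print("Install before running:", " ".join(missing))
```

With this in the first cell, the `!pip install` lines can stay commented out and only need to be run when the hint is printed.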
VibhuJawa commented on 2021-09-01T23:16:11Z ----------------------------------------------------------------
This code adds complexity.
I suggest switching to the snippet below. It's much faster, uses only cudf, and takes only 30 seconds.
%%time
input_bucket = 's3://amazon-reviews-pds'
input_path = '/parquet/product_category=Office_Products/*.parquet'
df = cudf.read_parquet(input_bucket + input_path, storage_options={'anon': True}, columns=['star_rating', 'review_body'])
VibhuJawa commented on 2021-09-01T23:16:12Z ----------------------------------------------------------------
Add a markdown section explaining that you are merging reviews here.
VibhuJawa commented on 2021-09-01T23:16:13Z ----------------------------------------------------------------
Add a section here explaining that you are going to tokenize the reviews.
VibhuJawa commented on 2021-09-01T23:16:13Z ----------------------------------------------------------------
Add a section to say that you are creating a Data Loader here.
VibhuJawa commented on 2021-09-01T23:16:14Z ----------------------------------------------------------------
Line #2. from transformers import BertModel
Add a markdown section explaining that you are creating a model here.
VibhuJawa commented on 2021-09-01T23:16:14Z ----------------------------------------------------------------
Line #3. def train_model(model, data_loader, loss_fn, optimizer, scheduler, n_examples):
Add a markdown section explaining the training loop.
VibhuJawa commented on 2021-09-01T23:16:15Z ----------------------------------------------------------------
Line #1. ## Use some custom examples
Add a markdown section with custom examples.