d2l-ai / d2l-en

Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
https://D2L.ai

[Question] How to determine train, test iteration and vocab? #2141

Closed teohsinyee closed 2 years ago

teohsinyee commented 2 years ago

Greetings.

I want to adapt this code to my own dataset instead of IMDB. load_data_imdb() returns three values: train_iter, test_iter, and vocab. To use my own dataset, do I need to produce all three myself? If so, what should each one contain?

Reference: https://classic.d2l.ai/chapter_natural-language-processing/sentiment-analysis.html#put-all-things-together

from torch import nn
from d2l import torch as d2l

batch_size = 64

train_iter, test_iter, vocab = d2l.load_data_imdb(batch_size)

AnirudhDagar commented 2 years ago

To use your own dataset, replace the read_imdb function with your own reader (for example, read_your_custom_data) and call it from your own loader (for example, load_data_custom) in place of load_data_imdb.

teohsinyee commented 2 years ago

load_data_custom

Hi @AnirudhDagar, could you share links to the documentation for read_your_custom_data and load_data_custom? I searched but couldn't find either of them.

AnirudhDagar commented 2 years ago

@teohsinyee there is no load_data_custom function in d2l. I only meant that you may want to copy load_data_imdb and rename it for your new dataset. The logic inside the function stays the same; only the step that reads the raw data changes.
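
For illustration, here is a minimal sketch of what that refactor could look like. It is not part of d2l: read_custom_data and load_data_custom are hypothetical names, and the CSV layout (a header row with `text` and `label` columns, labels 0 or 1) is an assumption made for the example. The body of load_data_custom follows the structure of load_data_imdb and uses only the d2l helpers that function relies on (d2l.tokenize, d2l.Vocab, d2l.truncate_pad, d2l.load_array).

```python
import csv


def read_custom_data(csv_path):
    """Hypothetical replacement for read_imdb.

    Assumes a CSV file with a header row and 'text' and 'label' columns,
    where each label is 0 or 1. Returns (list of texts, list of int labels).
    """
    texts, labels = [], []
    with open(csv_path, newline='', encoding='utf-8') as f:
        for row in csv.DictReader(f):
            texts.append(row['text'])
            labels.append(int(row['label']))
    return texts, labels


def load_data_custom(train_csv, test_csv, batch_size, num_steps=500):
    """Hypothetical counterpart of load_data_imdb for a custom dataset."""
    # Imported here so read_custom_data can be tried without torch or d2l.
    import torch
    from d2l import torch as d2l

    train_data = read_custom_data(train_csv)
    test_data = read_custom_data(test_csv)
    train_tokens = d2l.tokenize(train_data[0], token='word')
    test_tokens = d2l.tokenize(test_data[0], token='word')
    # Build the vocabulary from the training split only.
    vocab = d2l.Vocab(train_tokens, min_freq=5, reserved_tokens=['<pad>'])
    # Map tokens to indices, then truncate or pad each text to num_steps.
    train_features = torch.tensor([
        d2l.truncate_pad(vocab[line], num_steps, vocab['<pad>'])
        for line in train_tokens])
    test_features = torch.tensor([
        d2l.truncate_pad(vocab[line], num_steps, vocab['<pad>'])
        for line in test_tokens])
    train_iter = d2l.load_array(
        (train_features, torch.tensor(train_data[1])), batch_size)
    test_iter = d2l.load_array(
        (test_features, torch.tensor(test_data[1])), batch_size,
        is_train=False)
    return train_iter, test_iter, vocab
```

With something like this in place, the call from the question would become `train_iter, test_iter, vocab = load_data_custom('train.csv', 'test.csv', batch_size)`, where the file names are placeholders. For a small dataset, min_freq=5 may discard most words, so a lower value may be worth trying.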