Open wangze1219 opened 9 months ago
Please run pip install d2l==0.17.6 to use the older version of d2l, which still has these saved functions. In the latest version, we refactored the code and removed them.
What a nightmare. So much backward incompatibility! Could you mention which version to use at the beginning of each chapter?
If you are using the latest version of the book then it should work with the latest d2l package.
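For anyone hitting the same mismatch, it may help to check which d2l release is installed before running a chapter's code. This is only a sketch: the helper name installed_version is made up for illustration, and the expected version is simply the 0.17.6 mentioned earlier in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


expected = "0.17.6"  # the release suggested earlier in this thread
found = installed_version("d2l")
if found is None:
    print("d2l is not installed; try: pip install d2l==" + expected)
elif found != expected:
    print(f"d2l {found} is installed, but this code was written for {expected}")
else:
    print(f"d2l {found} matches the expected version")
```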
import math
import os
import random

import matplotlib.pyplot as plt
import torch
from d2l import torch as d2l
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
#@save
d2l.DATA_HUB['ptb'] = (d2l.DATA_URL + 'ptb.zip', '319d85e578af0cdc590547f26231e4e31cdf1e42')
#@save
def read_ptb():
    """Load the PTB dataset into a list of text lines."""
    data_dir = d2l.download_extract('ptb')
    # Read the training set.
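The rest of read_ptb did not survive in the post above. As a rough sketch of what such a loader typically does, the function below reads a text file with one sentence per line and splits each line on whitespace. The helper name read_tokenized_lines and the default file name ptb.train.txt are assumptions for illustration, not taken from this thread; check the d2l source for the actual body.

```python
import os


def read_tokenized_lines(data_dir, filename="ptb.train.txt"):
    """Read a text file and return a list of token lists, one per line.

    `filename` is an assumed default; adjust it to the extracted archive.
    """
    path = os.path.join(data_dir, filename)
    with open(path, encoding="utf-8") as f:
        raw_text = f.read()
    # Each line is treated as one sentence; tokens are whitespace-separated.
    return [line.split() for line in raw_text.split("\n")]
```

If the assumed file name is right, passing the directory returned by d2l.download_extract('ptb') to this helper should give the same kind of list that read_ptb is expected to return.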
sentences = read_ptb()
print(f'# sentences: {len(sentences)}')
vocab = d2l.Vocab(sentences, min_freq=10)
print(f'vocab size: {len(vocab)}')
#@save
def subsample(sentences, vocab):
    """Subsample high-frequency words."""
    # Exclude unknown tokens '<unk>'
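The body of subsample is also missing from the post. The sketch below shows one common way to do this step, the word2vec subsampling heuristic, where a token with relative frequency f is kept with probability min(1, sqrt(t / f)). The name subsample_sketch, the known_tokens set standing in for the vocabulary, the threshold t=1e-4, and the rng parameter are all assumptions for illustration; this is not necessarily the exact d2l implementation.

```python
import collections
import math
import random


def subsample_sketch(sentences, known_tokens, t=1e-4, rng=random):
    """Drop frequent tokens using the word2vec subsampling heuristic.

    A token with relative frequency f is kept with probability
    min(1, sqrt(t / f)). `known_tokens` stands in for the vocabulary.
    """
    # Exclude tokens outside the vocabulary (the '<unk>' case).
    sentences = [[tok for tok in line if tok in known_tokens]
                 for line in sentences]
    counter = collections.Counter(tok for line in sentences for tok in line)
    num_tokens = sum(counter.values())

    def keep(token):
        freq = counter[token] / num_tokens
        return rng.uniform(0, 1) < math.sqrt(t / freq)

    subsampled = [[tok for tok in line if keep(tok)] for line in sentences]
    return subsampled, counter
```

Like the call that follows in the original script, it returns both the subsampled sentences and the token counter, so the two lists can be compared in a histogram.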
subsampled, counter = subsample(sentences, vocab)
d2l.show_list_len_pair_hist(
    ['origin', 'subsampled'], '# tokens per sentence', 'count',
    sentences, subsampled)
plt.show()