tpietruszka / ulmfit_experiments

Some questions about reproducing #1

Open zkysfls opened 4 years ago

zkysfls commented 4 years ago

Hello! I want to reproduce your experiment and have run into some problems. First, for the data, I only see config JSON files in the repo, so does that mean I have to download the data somewhere else and reorganize it like this? [screenshot: annotation 2020-07-31 210314] Second, I do not see the model code; where can I find it? Third, I want to test it on another dataset, so should I write a config file like the example config file? If so, how do I set the parameters for the training phases and aggregation_class? Looking forward to your reply!

tpietruszka commented 4 years ago

Hi! Sorry for the delayed response. Ad 1: I think you do not need the header; otherwise it looks good. Or maybe the labels need to be "0" and "1", I do not recall. Anyway, it was the default format in which the fastai library downloads the IMDB dataset. Ad 2: the model code sits in classifiers.py and sequence_aggregations.py. It is just the classifier head, as the fastai default encoder is used. Ad 3: I'd just start with https://github.com/tpietruszka/ulmfit_experiments/blob/master/example_configs/imdb_full_agg_1.json and change the dataset path. aggregation_class can stay the same for sure; you might need to modify the numbers of epochs and learning rates in each phase, but I think this is reasonable as a starting point.
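For point 3, one way to derive a new config from the example without hand-editing it could look like the sketch below. It is only a generic JSON helper, not code from this repo: the function names `set_nested` and `derive_config` are made up for illustration, and the key names in the usage comment (`dataset_path`, `phase_1.epochs`) are placeholders, so check the actual keys in imdb_full_agg_1.json before using it. It only walks nested JSON objects, not lists.

```python
import json
from pathlib import Path
from typing import Any, Dict


def set_nested(config: Dict[str, Any], dotted_key: str, value: Any) -> None:
    """Overwrite an existing entry addressed by a dotted key such as 'a.b'."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node[part]  # raises KeyError if the path does not exist
    if leaf not in node:
        # Refuse to create new keys, so a typo cannot silently add an unused option.
        raise KeyError(f"{dotted_key!r} not found in config")
    node[leaf] = value


def derive_config(src: Path, dst: Path, overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Load the JSON config at src, apply overrides, write the result to dst."""
    config = json.loads(Path(src).read_text())
    for key, value in overrides.items():
        set_nested(config, key, value)
    Path(dst).write_text(json.dumps(config, indent=2))
    return config


# Hypothetical usage; the key names below are placeholders, not the repo's real keys:
# derive_config(
#     Path("example_configs/imdb_full_agg_1.json"),
#     Path("my_dataset.json"),
#     {"dataset_path": "data/my_dataset", "phase_1.epochs": 2},
# )
```

Rejecting unknown keys is a deliberate choice here: everything not overridden, including aggregation_class, stays exactly as in the example config.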

Overall though, this repo is more up to date and cleaned up: https://github.com/tpietruszka/ulmfit_attention so I would start there.