What is the purpose of this PR?
I fixed two small errors that I noticed during model training:
1. We use tf.data.Dataset.sample_from_datasets(datasets=train_datasets, weights=sample_probs, stop_on_empty_dataset=True) to sample from the datasets during multi-dataset training. Previously we did not set stop_on_empty_dataset=True, so the dataloader first sampled from datasets X, Y, Z according to sample_probs until one or more of them ran out of samples, and then kept drawing from the remaining datasets until all were empty, at which point the dataloader restarted. Toward the end of each pass the effective sampling ratio therefore no longer matched sample_probs. In this PR we set stop_on_empty_dataset=True, which restarts the dataloader as soon as one dataset is empty.
2. In evaluation_script.py we load external datasets for model evaluation. The argument parsing I implemented did not work correctly, so these external datasets failed to load. This is now fixed.
How did you implement your changes?
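Changed only 2-3 lines of code: set stop_on_empty_dataset=True in the sample_from_datasets call and fixed the argument parsing in evaluation_script.py.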
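For reviewers, here is a toy pure-Python model of the difference between the old and the new sampling behaviour. It is only a sketch and does not call TensorFlow: the function sample_from_iterables, the X/Y toy data, the 0.5/0.5 weights, and the seed are hypothetical stand-ins, and the model assumes all weights are positive. In the actual code the switch is the stop_on_empty_dataset argument of tf.data.Dataset.sample_from_datasets.

```python
import random
from typing import Iterator, Sequence


def sample_from_iterables(
    datasets: Sequence[Sequence[str]],
    weights: Sequence[float],
    stop_on_empty_dataset: bool,
    seed: int = 0,
) -> Iterator[str]:
    """Toy model of weighted sampling across datasets (hypothetical helper, not TF code).

    Assumes every weight is positive.
    """
    rng = random.Random(seed)
    iterators = [iter(d) for d in datasets]
    exhausted = [False] * len(datasets)
    while not all(exhausted):
        # Pick a dataset index according to the sampling weights.
        idx = rng.choices(range(len(datasets)), weights=weights, k=1)[0]
        try:
            item = next(iterators[idx])
        except StopIteration:
            if stop_on_empty_dataset:
                # New behaviour: end the pass at the first empty dataset.
                return
            # Old behaviour: skip the empty dataset and keep sampling the rest.
            exhausted[idx] = True
            continue
        yield item


# Hypothetical toy data: a small and a large dataset, sampled 50/50.
small = [f"X{i}" for i in range(3)]
large = [f"Y{i}" for i in range(20)]
sample_probs = [0.5, 0.5]

old = list(sample_from_iterables([small, large], sample_probs, stop_on_empty_dataset=False))
new = list(sample_from_iterables([small, large], sample_probs, stop_on_empty_dataset=True))

# Old behaviour drains everything, so the tail of `old` comes from one dataset only.
print("old:", len(old), "samples, tail:", old[-5:])
# New behaviour stops once a selected dataset turns out to be empty.
print("new:", len(new), "samples")
```

With the old setting every element is eventually drawn, so the end of each pass consists only of samples from whichever dataset is still non-empty. With the new setting the pass ends when the sampler hits an empty dataset, so the mix stays close to sample_probs for the whole pass, at the cost of not seeing every sample of the larger datasets in each pass.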
Remaining issues
None