When 'instances_per_epoch' is set in the MultiTaskDataLoader class, its __len__ function returns the wrong value.

Checklist

[ ] I have verified that the issue exists against the main branch of AllenNLP.
[ ] I have read the relevant section in the contribution guide on reporting bugs.
[ ] I have checked the issues list for similar or identical bug reports.
[ ] I have checked the pull requests list for existing proposed fixes.
[ ] I have checked the CHANGELOG and the commit log to find out if the bug was already fixed in the main branch.
[ ] I have included in the "Description" section below a traceback from any exceptions related to this bug.
[ ] I have included in the "Related issues or possible duplicates" section below all related issues and possible duplicate issues (if there are none, check this box anyway).
[ ] I have included in the "Environment" section below the name of the operating system and Python version that I was using when I discovered this bug.
[ ] I have included in the "Environment" section below the output of pip freeze.
[ ] I have included in the "Steps to reproduce" section below a minimally reproducible example.

Description

When I perform multi-task learning with AllenNLP, I configure the MultiTaskDataLoader as follows: I set 'instances_per_epoch' to 8000 and 'batch_size' to 16. I expect about 500 steps per epoch (8000 / 16). However, when I run my code, the progress bar shows 3000 steps. In fact there are not that many: the epoch completes after 502 steps.
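The expected step count is simple arithmetic on the two settings above. A minimal sketch, using only the values quoted in this report (the variable names are illustrative, not taken from the AllenNLP source):

```python
import math

# Values from the configuration described in this report.
instances_per_epoch = 8000
batch_size = 16

# If all instances were batched together, an epoch would need this many steps.
expected_steps = math.ceil(instances_per_epoch / batch_size)
print(expected_steps)
```

This is the number the progress bar should be close to; the observed 502 steps is near it, while the reported length of 3000 is not.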

After checking, I found that the following code in MultiTaskDataLoader is wrong:

The __init__ function of MultiTaskDataLoader shows that when 'instances_per_epoch' is set, a sampler is also provided.

So, when counting the instances for each dataset, __len__ should take into account the proportion of each dataset given by the sampler. The wrong code mentioned above should therefore be replaced by the following code:

    dataset_proportions = self.sampler.get_task_proportions(self._loaders)
    proportion_sum = sum(dataset_proportions.values())
    num_instances_per_dataset = {
        key: math.floor(proportion * self._instances_per_epoch / proportion_sum)
        for key, proportion in dataset_proportions.items()
    }
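To illustrate the effect of the proportional split, here is a standalone sketch that runs outside AllenNLP. The task names and proportions are hypothetical, and the helper functions are stand-ins written for this example, not part of the library. It also assumes that each dataset is batched separately, so each one can end with a partial batch; that assumption may not hold for every scheduler.

```python
import math
from typing import Dict


def instances_per_dataset(
    proportions: Dict[str, float], instances_per_epoch: int
) -> Dict[str, int]:
    """Split instances_per_epoch across datasets according to their proportions."""
    proportion_sum = sum(proportions.values())
    return {
        key: math.floor(proportion * instances_per_epoch / proportion_sum)
        for key, proportion in proportions.items()
    }


def estimated_num_batches(counts: Dict[str, int], batch_size: int) -> int:
    """Assumption: datasets are batched separately, so each may have one partial batch."""
    return sum(math.ceil(count / batch_size) for count in counts.values())


# Hypothetical proportions for three tasks (not taken from the report).
proportions = {"task_a": 3.0, "task_b": 2.0, "task_c": 1.0}

counts = instances_per_dataset(proportions, instances_per_epoch=8000)
num_batches = estimated_num_batches(counts, batch_size=16)

print(counts)       # {'task_a': 4000, 'task_b': 2666, 'task_c': 1333}
print(num_batches)  # 501
```

With these made-up proportions the estimate is 501 batches, slightly above 8000 / 16 = 500 because of the partial batches. This is the same order of magnitude as the 502 steps observed, although the actual proportions in my setup are different.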

Python traceback:

``` ```

Related issues or possible duplicates

None

Environment

OS: Linux

Python version: 3.7.13

AllenNLP version: 2.10.1

Output of pip freeze:

``` ```

Steps to reproduce

Example source:

``` ```
