nyu-mll / jiant-v1-legacy

The jiant toolkit for general-purpose text understanding models
MIT License

[CLOSED] Fixing Edge-Probing after multi-GPU release #1025

Closed: jeswan closed this 4 years ago

jeswan commented 4 years ago

Issue by pruksmhc Monday Mar 02, 2020 at 22:29 GMT Originally opened as https://github.com/nyu-mll/jiant/pull/1025


Addresses https://github.com/nyu-mll/jiant/issues/1017. This modifies edge probing to fit the new update_metrics function signature. There is still a strange error with edge probing on multi-GPU, so for now I raise an error in that case.


pruksmhc included the following code: https://github.com/nyu-mll/jiant/pull/1025/commits

jeswan commented 4 years ago

Comment by pyeres Tuesday Mar 03, 2020 at 14:51 GMT


Thanks @pruksmhc — two requests for information in the PR description:

  1. For which use cases / configurations do these changes provide a fix? (e.g., with this fix will the edge probing tasks work on a machine with only one GPU? How about on a machine with multiple GPUs when only one GPU ID is specified in the cuda config field?)
  2. How were these changes tested or validated?
jeswan commented 4 years ago

Comment by pyeres Tuesday Mar 03, 2020 at 23:45 GMT


PyTorch magic: it looks like the weird exception you were seeing can be addressed by wrapping the lists of modules (under the projs and span_extractors fields) in an nn.ModuleList(), like this:

        if self.is_symmetric or self.single_sided:
            # Use None as dummy padding for readability,
            # so that we can index projs[1] and projs[2]
            self.projs = nn.ModuleList([None, self.proj1, self.proj1])
        else:
            # Separate params for span2
            self.proj2 = self._make_cnn_layer(d_inp)
            self.projs = nn.ModuleList([None, self.proj1, self.proj2])

        # Span extractor, shared for both span1 and span2.
        self.span_extractor1 = self._make_span_extractor()
        if self.is_symmetric or self.single_sided:
            self.span_extractors = nn.ModuleList([None, self.span_extractor1, self.span_extractor1])
        else:
            self.span_extractor2 = self._make_span_extractor()
            self.span_extractors = nn.ModuleList([None, self.span_extractor1, self.span_extractor2])

After that, the forward method needs to be updated to match, but with those changes it looks like it can run under multi-GPU.
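For context on why this works, here is a minimal, self-contained sketch (an illustration only, not jiant code; the class names and layer sizes are made up). Submodules stored in a plain Python list are not registered on the parent nn.Module, so they are invisible to .parameters(), .to(device), and the replication that nn.DataParallel performs; nn.ModuleList registers them, and it accepts None as a placeholder, as in the snippet above:

    import torch.nn as nn

    class PlainListHead(nn.Module):
        def __init__(self):
            super().__init__()
            # A plain Python list: these Linear layers are NOT registered
            # as submodules of the parent module.
            self.projs = [nn.Linear(4, 4), nn.Linear(4, 4)]

    class ModuleListHead(nn.Module):
        def __init__(self):
            super().__init__()
            # nn.ModuleList registers its entries as submodules; None is
            # accepted as a dummy placeholder, as in the snippet above.
            self.projs = nn.ModuleList([None, nn.Linear(4, 4), nn.Linear(4, 4)])

    # Parameters held only in the plain list are invisible to the parent
    # module, so DataParallel would never replicate them across devices.
    print(sum(p.numel() for p in PlainListHead().parameters()))   # 0
    print(sum(p.numel() for p in ModuleListHead().parameters()))  # 40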

jeswan commented 4 years ago

Comment by pruksmhc Wednesday Mar 04, 2020 at 19:10 GMT


  1. This fix allows edge probing tasks to work in the single-GPU case; I only tested on a machine with one GPU. However, if there are multiple GPUs and only one GPU ID is specified, only that one GPU should be used (and DataParallel should not be used); if that is not the case, it calls for a more serious investigation of our DataParallel implementation (see the sketch after this list).
  2. I ran this fix for edges-dpr. All the probing datasets share the same API (two sentences, a multi-class classification task) and pass through the same code (which is what I changed in this PR), so I only tested one task. I can run more validation checks on other edge probing tasks.
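As a point of reference for item 1, here is a generic PyTorch sketch of the expected behavior (an illustration, not jiant's actual DataParallel code; the helper name and arguments are hypothetical): the model is wrapped in nn.DataParallel only when more than one device ID is configured.

    import torch.nn as nn

    def place_model(model: nn.Module, device_ids):
        """device_ids: CUDA device indices parsed from the cuda config field (hypothetical helper)."""
        model = model.to("cuda:{}".format(device_ids[0]))
        if len(device_ids) > 1:
            # Only wrap in DataParallel when more than one GPU ID is given;
            # with a single ID, the model simply runs on that device.
            model = nn.DataParallel(model, device_ids=device_ids)
        return model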
jeswan commented 4 years ago

Comment by pyeres Thursday Mar 05, 2020 at 15:56 GMT


  1. This fix allows edge probing tasks to work in the single-GPU case; I only tested on a machine with one GPU. However, if there are multiple GPUs and only one GPU ID is specified, only that one GPU should be used (and DataParallel should not be used); if that is not the case, it calls for a more serious investigation of our DataParallel implementation.
  2. I ran this fix for edges-dpr. All the probing datasets share the same API (two sentences, a multi-class classification task) and pass through the same code (which is what I changed in this PR), so I only tested one task. I can run more validation checks on other edge probing tasks.

@pruksmhc — thanks for info!

I'm planning to open a (WIP) PR on this branch today to extend your fix to cover multi-GPU.

jeswan commented 4 years ago

Comment by pyeres Thursday Mar 05, 2020 at 19:40 GMT


I'm planning to open a (WIP) PR on this branch today to extend your fix to cover multi-GPU.

@pruksmhc — I extended your fix to cover multi-GPU mode (w/ PR #1029). If my changes look OK please merge them into this branch and I'll plan to validate our changes together by comparing:

  1. Edge probing experiment run w/ single-GPU using jiant 1.2.1
  2. Edge probing experiment run w/ single-GPU using this branch
  3. Edge probing experiment run w/ multi-GPU using this branch
jeswan commented 4 years ago

Comment by pyeres Friday Mar 06, 2020 at 22:19 GMT


@pruksmhc — I extended your fix to cover multi-GPU mode (w/ PR #1029). If my changes look OK please merge them into this branch and I'll plan to validate our changes together by comparing:

  1. Edge probing experiment run w/ single-GPU using jiant 1.2.1
  2. Edge probing experiment run w/ single-GPU using this branch
  3. Edge probing experiment run w/ multi-GPU using this branch

fyi, here are results on edges-ner-ontonotes* showing consistent performance w/ these changes:

| version | GPUs | macro avg. | mcc | acc. | precision | recall | f1 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1.2.1 | 1 | 0.972 | 0.970 | 0.970 | 0.973 | 0.971 | 0.972 |
| 1.3.0 @ 0c3d128 | 1 | 0.971 | 0.969 | 0.969 | 0.971 | 0.970 | 0.971 |
| 1.3.0 @ 0c3d128 | 2 | 0.971 | 0.969 | 0.969 | 0.972 | 0.970 | 0.971 |

Please let me know if there are any other experiments I can help run.

* using base_edgeprobing.conf w/ overrides batch_size=4, reload_vocab=1, do_pretrain=0, do_target_task_training=1, do_full_eval=1, lr=5e-6, dropout=0.1, random_seed=42, input_module=roberta-large, pretrain_tasks=none, target_tasks=edges-ner-ontonotes

jeswan commented 4 years ago

Comment by pruksmhc Sunday Mar 08, 2020 at 12:49 GMT


Merged! Also, I ran smoke tests on all edge probing and SentEval tasks on a single GPU, @pyeres, so I think all that remains is smoke tests for multi-GPU. Given that we've already done extensive performance validation, I don't think, for the sake of time, we need to repeat it for all of the edge probing and SentEval tasks.

jeswan commented 4 years ago

Comment by pyeres Monday Mar 09, 2020 at 17:20 GMT


Merged! Also, I ran smoke tests on all edge probing and SentEval tasks on a single GPU, @pyeres, so I think all that remains is smoke tests for multi-GPU. Given that we've already done extensive performance validation, I don't think, for the sake of time, we need to repeat it for all of the edge probing and SentEval tasks.

Great, thanks @pruksmhc — I'll kick off runs for edge probing tasks w/ multi-GPU. Can you show me an example of the kind of run config you used for Senteval tasks? (I'll copy it for multi-GPU).

jeswan commented 4 years ago

Comment by pruksmhc Monday Mar 09, 2020 at 17:38 GMT


    export TM_PROBING_TASK_NAMES=(edges-spr1 edges-spr2 edges-dpr se-probing-bigram-shift)
    for i in "${TM_PROBING_TASK_NAMES[@]}"
    do
        python main.py \
            --config_file "jiant/config/taskmaster/base_roberta.conf" \
            --overrides "max_vals=0, exp_name=edges-check, batch_size=2, reload_vocab=1, do_pretrain=0, do_target_task_training=1, do_full_eval=1, lr=5e-6, dropout=0.1, random_seed=42, input_module=roberta-large, pretrain_tasks=$i, target_tasks=$i, run_name=$i, val_interval=100, target_train_val_interval=50, max_vals=1, target_train_max_vals=1"
    done

jeswan commented 4 years ago

Comment by pyeres Tuesday Mar 10, 2020 at 12:48 GMT


    export TM_PROBING_TASK_NAMES=(edges-spr1 edges-spr2 edges-dpr se-probing-bigram-shift)
    for i in "${TM_PROBING_TASK_NAMES[@]}"
    do
        python main.py \
            --config_file "jiant/config/taskmaster/base_roberta.conf" \
            --overrides "max_vals=0, exp_name=edges-check, batch_size=2, reload_vocab=1, do_pretrain=0, do_target_task_training=1, do_full_eval=1, lr=5e-6, dropout=0.1, random_seed=42, input_module=roberta-large, pretrain_tasks=$i, target_tasks=$i, run_name=$i, val_interval=100, target_train_val_interval=50, max_vals=1, target_train_max_vals=1"
    done

Thanks, @pruksmhc — confirmed multi-GPU validation runs don't show regression.