drivendataorg / zamba

A Python package for identifying 42 kinds of animals, training custom models, and estimating distance from camera trap videos
https://zamba.drivendata.org/docs/stable/
MIT License

Typer integration with rich causes error messages that are too long #233

Closed · pjbull closed this issue 2 years ago

pjbull commented 2 years ago

Since typer integrated rich for fancier tracebacks, whenever our CLI errors it prints the entire model architecture to the console, taking up hundreds of lines. We should look into truncating or turning off this behavior.

For example, the last ~100 lines look like:

│ │                   │   │   │   328, eps=1e-05, momentum=0.1, affine=True,                     │ │
│ │                   track_running_stats=True                                                   │ │
│ │                   │   │   │   (drop): Identity()                                             │ │
│ │                   │   │   │   (act): Identity()                                              │ │
│ │                   │   │     )                                                                │ │
│ │                   │   │     (drop_path): Identity()                                          │ │
│ │                   │   │   )                                                                  │ │
│ │                   │   │   (23): InvertedResidual(                                            │ │
│ │                   │   │     (conv_pw): Conv2d(328, 1968, kernel_size=(1, 1), stride=(1, 1),  │ │
│ │                   bias=False)                                                                │ │
│ │                   │   │     (bn1): BatchNormAct2d(                                           │ │
│ │                   │   │   │   1968, eps=1e-05, momentum=0.1, affine=True,                    │ │
│ │                   track_running_stats=True                                                   │ │
│ │                   │   │   │   (drop): Identity()                                             │ │
│ │                   │   │   │   (act): SiLU(inplace=True)                                      │ │
│ │                   │   │     )                                                                │ │
│ │                   │   │     (conv_dw): Conv2d(1968, 1968, kernel_size=(3, 3), stride=(1, 1), │ │
│ │                   padding=(1, 1), groups=1968, bias=False)                                   │ │
│ │                   │   │     (bn2): BatchNormAct2d(                                           │ │
│ │                   │   │   │   1968, eps=1e-05, momentum=0.1, affine=True,                    │ │
│ │                   track_running_stats=True                                                   │ │
│ │                   │   │   │   (drop): Identity()                                             │ │
│ │                   │   │   │   (act): SiLU(inplace=True)                                      │ │
│ │                   │   │     )                                                                │ │
│ │                   │   │     (se): SqueezeExcite(                                             │ │
│ │                   │   │   │   (conv_reduce): Conv2d(1968, 82, kernel_size=(1, 1), stride=(1, │ │
│ │                   1))                                                                        │ │
│ │                   │   │   │   (act1): SiLU(inplace=True)                                     │ │
│ │                   │   │   │   (conv_expand): Conv2d(82, 1968, kernel_size=(1, 1), stride=(1, │ │
│ │                   1))                                                                        │ │
│ │                   │   │   │   (gate): Sigmoid()                                              │ │
│ │                   │   │     )                                                                │ │
│ │                   │   │     (conv_pwl): Conv2d(1968, 328, kernel_size=(1, 1), stride=(1, 1), │ │
│ │                   bias=False)                                                                │ │
│ │                   │   │     (bn3): BatchNormAct2d(                                           │ │
│ │                   │   │   │   328, eps=1e-05, momentum=0.1, affine=True,                     │ │
│ │                   track_running_stats=True                                                   │ │
│ │                   │   │   │   (drop): Identity()                                             │ │
│ │                   │   │   │   (act): Identity()                                              │ │
│ │                   │   │     )                                                                │ │
│ │                   │   │     (drop_path): Identity()                                          │ │
│ │                   │   │   )                                                                  │ │
│ │                   │     )                                                                    │ │
│ │                   │   )                                                                      │ │
│ │                   │   (conv_head): Conv2d(328, 2152, kernel_size=(1, 1), stride=(1, 1),      │ │
│ │                   bias=False)                                                                │ │
│ │                   │   (bn2): BatchNormAct2d(                                                 │ │
│ │                   │     2152, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True │ │
│ │                   │     (drop): Identity()                                                   │ │
│ │                   │     (act): SiLU(inplace=True)                                            │ │
│ │                   │   )                                                                      │ │
│ │                   │   (global_pool): SelectAdaptivePool2d (pool_type=avg,                    │ │
│ │                   flatten=Flatten(start_dim=1, end_dim=-1))                                  │ │
│ │                   │   (classifier): Identity()                                               │ │
│ │                     ))                                                                       │ │
│ │                     (classifier): Sequential(                                                │ │
│ │                   │   (0): Linear(in_features=2152, out_features=256, bias=True)             │ │
│ │                   │   (1): Dropout(p=0.2, inplace=False)                                     │ │
│ │                   │   (2): ReLU()                                                            │ │
│ │                   │   (3): Linear(in_features=256, out_features=64, bias=True)               │ │
│ │                   │   (4): Flatten(start_dim=1, end_dim=-1)                                  │ │
│ │                   │   (5): Linear(in_features=1024, out_features=32, bias=True)              │ │
│ │                     )                                                                        │ │
│ │                   )                                                                          │ │
│ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
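One possible mitigation (a sketch, not necessarily the fix that was shipped): typer's rich integration renders local variables in the traceback panel, which is what drags the full model repr into the output, and typer >= 0.6 exposes constructor flags to turn that off.

```python
# Sketch of a possible mitigation, assuming typer >= 0.6 with rich installed.
import typer

app = typer.Typer(
    pretty_exceptions_show_locals=False,  # keep the rich panel but drop the locals dump
    # pretty_exceptions_enable=False,     # or fall back to plain Python tracebacks entirely
)
```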
ejm714 commented 2 years ago

@pjbull Can you give me an example line that errored?

pjbull commented 2 years ago

Yeah, this was just from the error I saw while running training, which is filed in #234. I think if you put a raise anywhere the model itself is in local scope, you'll see this output.

That one errored here: https://github.com/drivendataorg/zamba/blob/eba2cec58b990725920f177ef68fcde4af3eecc9/zamba/models/model_manager.py#L207-L209
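For context, a minimal reproduction of the behavior described above might look like the following (a hypothetical standalone script, not code from the zamba repo): rich-formatted tracebacks print local variables, so any exception raised while a large `nn.Module` is in scope dumps its full repr to the console.

```python
# Minimal reproduction sketch, assuming typer >= 0.6 (rich tracebacks on by default) and torch installed.
import torch.nn as nn
import typer

app = typer.Typer()

@app.command()
def train():
    # Any sizeable model works; its repr is what floods the traceback panel.
    model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 2))
    raise RuntimeError("boom")  # the rich traceback will include `model` among the locals

if __name__ == "__main__":
    app()
```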