CAIIVS / chuchichaestli

Where you find all the state-of-the-art cooking utensils (salt, pepper, gradient descent... the usual).
GNU General Public License v3.0

Question: model config (attention) #5

Closed mstadelmann closed 3 months ago

mstadelmann commented 4 months ago

How do I set up the attention blocks? What does the parameter add_attention do, and how does it relate to the use of the AttnUp/Down blocks?

If I set it to True, I get:

Traceback (most recent call last):
  File "/usr/local/bin/vlf", line 33, in <module>
    sys.exit(load_entry_point('vlf-core', 'console_scripts', 'vlf')())
  File "/home/user/dev/vlf-core/vlf/run_experiment.py", line 59, in main
    experiment.prepareTraining()
  File "/home/user/dev/vlf-core/vlf/experiment.py", line 342, in prepareTraining
    self.createModel()
  File "/home/user/dev/vlf-core/vlf/experiment.py", line 324, in createModel
    self.networkModel = currentNetArch.createNetwork(self)
  File "/home/user/dev/vlf-core/networks/unet_chuchichaestli.py", line 94, in createNetwork
    model = UNet(
  File "/home/user/dev/chuchichaestli/src/chuchichaestli/models/unet/unet.py", line 152, in __init__
    self.mid_block = DIM_TO_BLOCK_MAP[dimensions][mid_block_type](
  File "/home/user/dev/chuchichaestli/src/chuchichaestli/models/unet/unet_2d_blocks.py", line 189, in __init__
    Attention(
TypeError: Attention.__init__() got an unexpected keyword argument 'upcast_softmax'
mstadelmann commented 4 months ago

(In case it's relevant: this happens in 2D.)

bil-y commented 4 months ago

add_attention only controls the attention layer in the bottleneck (called the middle block in the code). That's something we inherited from Hugging Face and haven't gotten around to changing yet, and it clearly no longer works, since I've drastically slimmed down the attention blocks' parameters 😅
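In other words, the mid block still forwards a legacy keyword that the slimmed-down Attention constructor no longer accepts. A minimal sketch of that failure mode (the class below is illustrative only, not the actual chuchichaestli code):

```python
# Illustrative sketch only -- NOT the actual chuchichaestli code.
# A caller forwarding a legacy keyword argument to a constructor
# that no longer accepts it fails exactly like the traceback above.

class Attention:
    def __init__(self, channels: int):  # 'upcast_softmax' was dropped here
        self.channels = channels

# A mid block still passing the old Hugging Face-style kwarg:
Attention(channels=64, upcast_softmax=True)
# TypeError: Attention.__init__() got an unexpected keyword argument 'upcast_softmax'
```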

For an example of attention in the up/down blocks, have a look at the tests. It's done by using AttnDownBlock / AttnUpBlock in down_block_types and up_block_types, respectively.
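Something along these lines (a hypothetical sketch: the exact argument names and block-type strings, other than add_attention, down_block_types, and up_block_types, are assumptions; the tests are the authoritative reference):

```python
from chuchichaestli.models.unet import UNet

# Hypothetical configuration sketch; check the package tests for actual usage.
model = UNet(
    dimensions=2,  # 2D, as in the traceback above
    add_attention=False,  # skip the (currently broken) mid-block attention
    down_block_types=("DownBlock", "AttnDownBlock", "AttnDownBlock"),
    up_block_types=("AttnUpBlock", "AttnUpBlock", "UpBlock"),
)
```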

mstadelmann commented 4 months ago

OK, thanks!