DeepTrackAI / DeepTrack2

MIT License

Set kernel_size when creating model #178

Open elindgren opened 1 year ago

elindgren commented 1 year ago

Ideally, one would want to set not only the number of filters for each convolutional layer but also their kernel size. The workaround at the moment is to pass the convolutional block you want as `encoder_convolution_block`, but it would be nicer to pass the kernel sizes as a tuple.

(This is related to the NFFY314 course; Exercise 4.3 asks one to change the kernel size, which is quite cumbersome at the moment.)

Suggested API

model = dt.models.EncoderDecoder(
    ...
    conv_layers_dimensions=(64, 64, 64),
    kernel_size=(3, 5, 7),
    ...
)

https://github.com/softmatterlab/DeepTrack2/blob/dbb2ab2752e228906a8d6623fdf408ffa40f1826/deeptrack/models/convolutional.py#L518
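A minimal sketch of how such a `kernel_size` argument could be paired with `conv_layers_dimensions` internally, accepting either a single int or one entry per layer. The helper name and behavior are my assumptions for illustration, not DeepTrack2 code:

```python
def pair_layer_params(conv_layers_dimensions, kernel_size=3):
    """Pair each layer's filter count with a kernel size.

    `kernel_size` may be a single int (applied to every layer) or a
    sequence with one entry per layer. Hypothetical helper, not part
    of the DeepTrack2 API.
    """
    n = len(conv_layers_dimensions)
    if isinstance(kernel_size, int):
        kernel_size = (kernel_size,) * n
    if len(kernel_size) != n:
        raise ValueError(
            "kernel_size must be an int or have one entry per layer"
        )
    # One dict of keyword arguments per convolutional layer.
    return [
        {"filters": f, "kernel_size": k}
        for f, k in zip(conv_layers_dimensions, kernel_size)
    ]
```

For example, `pair_layer_params((64, 64, 64), (3, 5, 7))` would yield one parameter dict per layer, which the model constructor could forward to each convolutional block.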

BenjaminMidtvedt commented 1 year ago

I agree conceptually, and I'd be amenable to accepting this change. However, it does not scale: there are simply too many ways one may want to modify the architecture to cover them all as separate parameters. A perhaps more palatable formulation is:

# only set channels
model = dt.models.EncoderDecoder(
    ...
    conv_layers_params=(64, 64, 64),
    ...
)
# set channels and kernels
model = dt.models.EncoderDecoder(
    ...
    conv_layers_params=[
        dict(filters=64, kernel_size=3),
        dict(filters=64, kernel_size=5),
        dict(filters=64, kernel_size=7),
    ],
    ...
)

This would scale somewhat better, since additional parameters such as the activation function, stride, and padding could be passed as needed.
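Internally, such an argument could be handled by a small normalization step that accepts either bare channel counts or per-layer dicts and merges the dicts over defaults. The function name and default values below are illustrative assumptions, not actual DeepTrack2 internals:

```python
def normalize_conv_layers_params(conv_layers_params, defaults=None):
    """Normalize each entry to a dict of layer keyword arguments.

    A bare int is treated as the filter count; a dict is merged over
    the defaults, so any keyword (kernel_size, activation, strides,
    padding, ...) can be overridden per layer. Illustrative sketch only.
    """
    if defaults is None:
        defaults = {"kernel_size": 3, "activation": "relu"}
    normalized = []
    for entry in conv_layers_params:
        params = dict(defaults)  # start from the shared defaults
        if isinstance(entry, int):
            params["filters"] = entry
        else:
            params.update(entry)  # per-layer overrides win
        normalized.append(params)
    return normalized
```

With this, `conv_layers_params=(64, 64, 64)` and `conv_layers_params=[dict(filters=64, kernel_size=5), ...]` both reduce to the same internal structure.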

elindgren commented 1 year ago


That's fair, I think such a solution would also work well. My main reason for suggesting the change was to make the API clearer to students of the NFFY314 course, who may not be familiar with DeepTrack.

BenjaminMidtvedt commented 1 year ago

It's a good point, and I agree. I'll discuss with the group internally how such a change can be made without breaking backward compatibility.
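One way to keep backward compatibility during such a transition is a resolver that accepts both the legacy `conv_layers_dimensions` argument and the proposed `conv_layers_params`, emitting a deprecation warning for the former. This is a hypothetical shim sketching the idea, not a committed design:

```python
import warnings


def resolve_layer_params(conv_layers_dimensions=None, conv_layers_params=None):
    """Resolve legacy and new-style arguments to one list of layer dicts.

    Hypothetical shim: keeps the old `conv_layers_dimensions` argument
    working while steering users toward `conv_layers_params`.
    """
    if conv_layers_dimensions is not None and conv_layers_params is not None:
        raise ValueError(
            "pass either conv_layers_dimensions or conv_layers_params, not both"
        )
    if conv_layers_dimensions is not None:
        warnings.warn(
            "conv_layers_dimensions is deprecated; use conv_layers_params",
            DeprecationWarning,
        )
        conv_layers_params = conv_layers_dimensions
    # Ints become filter counts; dicts pass through unchanged.
    return [
        {"filters": p} if isinstance(p, int) else dict(p)
        for p in conv_layers_params
    ]
```

Old call sites keep working (with a warning), while new code can pass per-layer dicts directly.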