FluxML / Metalhead.jl

Computer vision models for Flux
https://fluxml.ai/Metalhead.jl

UNet inchannels != 3 not working #243

Open nickkeepfer opened 1 year ago

nickkeepfer commented 1 year ago

It seems the UNet implementation only works with 3-channel inputs. Constructing it with inchannels = 1 fails:

Status `/private/var/folders/98/8xz00xq10hq7kdrdvdjd1jy00000gn/T/jl_EZ324U/Project.toml`
  [dbeba491] Metalhead v0.8.0
  [44cfe95a] Pkg v1.8.0

using Metalhead
UNet((128,128),1,3,Metalhead.backbone(DenseNet(121)))

ERROR: DimensionMismatch: layer Conv((7, 7), 3 => 64, pad=3, stride=2, bias=false) expects size(input, 3) == 3, but got 128×128×1×1 Array{Flux.NilNumber.Nil, 4}

theabhirath commented 1 year ago

Hi @nickkeepfer, as I understand it, the way U-Net is constructed right now requires you to specify inchannels in two places: once for UNet itself and once for the backbone. So

UNet((128,128),1,3,Metalhead.backbone(DenseNet(121; inchannels = 1)))

should hopefully yield the model that you want. Do let me know if this works the way you expect it to. If it does, there can be a discussion around reducing this redundancy.

nickkeepfer commented 1 year ago

Yep, that's exactly what I needed, thanks! From a new user's perspective, it would be better if the inchannels passed to UNet were automatically propagated to the backbone, rather than having to specify it twice. At the very least, the docs should make this clear with an example showing that you need to specify it twice.
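Until something like that lands, the propagation can be approximated with a small wrapper. This is only a minimal sketch, assuming Metalhead v0.8.0 and the `inchannels` keyword of `DenseNet` used in the workaround above. The helper name `unet_with_densenet` and its `depth` keyword are hypothetical and not part of Metalhead.

```julia
using Metalhead

# Hypothetical helper (not part of Metalhead): builds the DenseNet encoder
# with the same `inchannels` that is passed to UNet, so the value is
# written only once at the call site.
function unet_with_densenet(imsize::Dims{2}, inchannels::Integer,
                            outplanes::Integer; depth::Integer = 121)
    # The backbone's first convolution is built for `inchannels` channels.
    encoder = Metalhead.backbone(DenseNet(depth; inchannels = inchannels))
    # UNet still receives `inchannels` explicitly, as v0.8.0 requires.
    return UNet(imsize, inchannels, outplanes, encoder)
end

# Single-channel 128×128 input, 3 output planes, as in the original report.
model = unet_with_densenet((128, 128), 1, 3)
```

The wrapper does not change how UNet works; it just keeps the two channel counts from drifting apart.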

theabhirath commented 1 year ago

Yeah, it's definitely weird to keep it this way. I will try to land a PR soon to address this.