We can try this more powerful CNN architecture on our data. There should be a lot of resources for implementing an out-of-the-box ResNet-18 (Res = residual, Net = network, and 18 refers to the number of weighted layers). Torchvision provides a prebuilt model, but it expects images to have 3 channels. We might try to adapt that or code our own.
There is a tutorial here, and the original paper on the residual architecture can be found here if anyone is interested. The layers look like so: