PseudoTerminal is doing some really interesting work expanding the model depth from 28 -> 42 and finetuning on new data, for example: https://huggingface.co/ptx0/pixart-900m-1024-ft
I tried hacking the code to increase the depth. By modifying the depth values across all the files, I got the model to load and eliminated the missing/dropped keys error, but there was clearly a render error: the output had sort of recognizable shapes with a lot of noise and distortion. Maybe someone smarter than me can take a crack at this?
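In case it helps whoever picks this up, here is a rough sketch of the sanity check I'd run before loading: infer the depth from the checkpoint's key names and diff the keys against what the model expects. It assumes the transformer blocks are stored under keys like `blocks.<i>.…` (that prefix is my assumption, other layouts may use a different one), and `infer_depth` and `compare_keys` are hypothetical helper names, not part of any library. It only confirms the key layout lines up; it does not explain the noisy output.

```python
import re


def infer_depth(keys, prefix="blocks"):
    """Count transformer blocks by finding the highest `<prefix>.<i>.` index.

    The prefix is an assumption about the checkpoint layout; adjust as needed.
    """
    pattern = re.compile(rf"(?:^|\.){re.escape(prefix)}\.(\d+)\.")
    indices = set()
    for key in keys:
        match = pattern.search(key)
        if match:
            indices.add(int(match.group(1)))
    return max(indices) + 1 if indices else 0


def compare_keys(model_keys, ckpt_keys):
    """Return (missing, unexpected) keys, relative to what the model expects."""
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    missing = sorted(model_keys - ckpt_keys)      # model wants, checkpoint lacks
    unexpected = sorted(ckpt_keys - model_keys)   # checkpoint has, model drops
    return missing, unexpected


# Toy key lists standing in for model.state_dict().keys() and the loaded checkpoint.
ckpt_keys = [f"blocks.{i}.attn.qkv.weight" for i in range(42)] + ["final_layer.linear.weight"]
model_keys = [f"blocks.{i}.attn.qkv.weight" for i in range(28)] + ["final_layer.linear.weight"]

ckpt_depth = infer_depth(ckpt_keys)
model_depth = infer_depth(model_keys)
missing, unexpected = compare_keys(model_keys, ckpt_keys)

print(f"checkpoint depth: {ckpt_depth}, model depth: {model_depth}")
print(f"missing: {len(missing)}, unexpected: {len(unexpected)}")
```

If the two depths agree and both lists come back empty, the loading side is probably fine and the distortion is coming from somewhere else in the pipeline.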