A neural-network-based file sorter. It either trains an autoencoder and sorts images or audio by the similarity of their encodings, or uses the OpenAI CLIP model.
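As a rough illustration of "sorting by similarity of encodings", here is a minimal sketch. It assumes each file has already been mapped to a fixed-length vector; the vectors, file names, and the greedy nearest-neighbour ordering below are all hypothetical choices made for the example, not necessarily what this project does.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def order_by_similarity(encodings):
    """Greedy ordering: start from the first file, then repeatedly
    append the unvisited file most similar to the last one chosen."""
    names = list(encodings)
    if not names:
        return []
    order = [names[0]]
    remaining = set(names[1:])
    while remaining:
        last = encodings[order[-1]]
        nearest = max(remaining, key=lambda n: cosine_similarity(last, encodings[n]))
        order.append(nearest)
        remaining.remove(nearest)
    return order


# Made-up 3-dimensional encodings (assumption: real ones would come
# from the trained autoencoder or from CLIP and be much larger).
toy_encodings = {
    "cat_1.png": [0.9, 0.1, 0.0],
    "kick.wav": [0.0, 0.1, 0.9],
    "cat_2.png": [0.8, 0.2, 0.1],
    "snare.wav": [0.1, 0.0, 0.8],
}

sorted_names = order_by_similarity(toy_encodings)
print(sorted_names)
```

With these toy vectors, files whose encodings point in similar directions end up next to each other. A greedy walk is only one option; projecting the encodings to one dimension and sorting on that value would be another.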
At minimum, audio support must be its own class, with options in the dataloader for handling audio and the methods needed to work with audio data (windowed Fourier transforms? 1D convolutions? transformers?).
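To make the "windowed FTs?" option concrete, the sketch below computes a short-time Fourier transform magnitude spectrogram using only the standard library. The function names, frame size, hop length, and Hann window are assumptions chosen for illustration; a real audio path would more likely rely on an FFT routine such as `numpy.fft.rfft` or `torch.stft` rather than this slow direct DFT.

```python
import cmath
import math


def hann_window(size):
    """Periodic Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / size) for n in range(size)]


def stft_magnitudes(signal, frame_size=64, hop=32):
    """Split the signal into overlapping windowed frames and return,
    for each frame, the magnitudes of DFT bins 0 .. frame_size // 2."""
    window = hann_window(frame_size)
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            # Direct O(N^2) DFT, fine for a toy example only.
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Toy input: a 128 Hz sine sampled at 1024 Hz (hypothetical values).
# With 64-sample frames this frequency falls on bin 128 * 64 / 1024 = 8.
sample_rate = 1024
signal = [math.sin(2 * math.pi * 128 * t / sample_rate) for t in range(256)]

spectrogram = stft_magnitudes(signal)
peak_bin = max(range(len(spectrogram[0])), key=spectrogram[0].__getitem__)
print(len(spectrogram), len(spectrogram[0]), peak_bin)
```

The resulting frames-by-bins grid can be treated like a single-channel image, which is one reason a windowed transform could let audio reuse an image-style encoder; 1D convolutions or transformers would instead operate on the raw waveform or on the frame sequence.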