Closed: solomesolo closed this issue 4 years ago
In many dataset images the actual scan occupies only part of the frame. The neural network should focus on the bone itself and ignore the empty zones around it, so a script was written to detect the region of interest and enlarge the bone within the image.
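A minimal sketch of this kind of ROI cropping, assuming grayscale scans with a dark empty background; the threshold and target size are illustrative assumptions, not values from the project:

```python
import cv2
import numpy as np

def crop_to_bone(image: np.ndarray, threshold: int = 10, target_size: int = 224) -> np.ndarray:
    """Crop a grayscale scan to the bounding box of the non-empty region,
    then resize so the bone fills the frame."""
    ys, xs = np.where(image > threshold)      # pixels brighter than the empty background
    if ys.size == 0:                          # blank image: fall back to a plain resize
        return cv2.resize(image, (target_size, target_size))
    roi = image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    return cv2.resize(roi, (target_size, target_size))
```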
To increase accuracy, the bone should be aligned more precisely within the frame. This means preprocessing each image by adjusting not only its scale but also its rotation, and possibly its perspective. Several approaches are possible:

- train a network that detects key points in the image and aligns them against a template of those key points;
- train a network that predicts the parameters of the affine transformation producing the image most convenient for the downstream network that detects the anomaly;
- use an already trained network to measure how perspective changes (image augmentations) affect abnormality-detection accuracy, and in this way obtain a dataset for training the image-alignment network.

The best option, however, is to train end to end using, for example, a Spatial Transformer Network (STN) as a learnable module that performs affine transformations inside the network.
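A minimal STN sketch in PyTorch, along the lines of the end-to-end option above: a small localization network predicts the 2x3 affine matrix, which is applied to the input before it reaches the abnormality classifier. The layer sizes are illustrative assumptions for 1x224x224 inputs, not the project's actual architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class STN(nn.Module):
    """Spatial Transformer module: learns an affine transform of the input."""
    def __init__(self):
        super().__init__()
        self.loc = nn.Sequential(                       # localization network
            nn.Conv2d(1, 8, kernel_size=7), nn.MaxPool2d(2), nn.ReLU(),
            nn.Conv2d(8, 10, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(),
        )
        self.fc = nn.Sequential(                        # regresses the 6 affine parameters
            nn.Linear(10 * 52 * 52, 32), nn.ReLU(), nn.Linear(32, 6),
        )
        # Initialize to the identity transform so early training is stable.
        self.fc[-1].weight.data.zero_()
        self.fc[-1].bias.data.copy_(torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):
        theta = self.fc(self.loc(x).flatten(1)).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)
```

Because the transform is differentiable, the alignment is learned jointly with the classifier, with no separate keypoint labels needed.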
Created a DataLoader that preprocesses the whole dataset and loads it into RAM. This almost halved training time: one epoch now takes about 10 minutes. The dataset will later be expanded and new image-augmentation methods added, so this time will grow, but not by much.
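A sketch of the RAM-caching idea, assuming hypothetical `paths`, `labels`, and `preprocess` names for illustration:

```python
from torch.utils.data import Dataset

class InMemoryDataset(Dataset):
    """Preprocess every image once at construction time and keep the
    results in RAM, so each epoch only indexes into memory."""
    def __init__(self, paths, labels, preprocess):
        self.labels = labels
        self.images = [preprocess(p) for p in paths]   # one-time cost at startup

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return self.images[idx], self.labels[idx]
```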
Tried to parallelize data loading with multiprocessing or multithreading to reduce loading time; with the added preprocessing (scaling to the size of the bone) it is now 11 minutes, but the attempt has failed so far. An alternative is to preprocess the whole dataset once, save it, and load the preprocessed version afterwards. That is not practical yet, because the preprocessing pipeline still changes. It should be done once the preprocessing method is fixed, and when it becomes necessary to run experiments training neural networks with different architectures, parameters, and training methods.
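If the parallel-loading attempt is revisited, PyTorch's built-in worker pool is the usual route: do the preprocessing in `__getitem__` and let `DataLoader` workers run it in parallel. A sketch under the same assumed names as above:

```python
from torch.utils.data import DataLoader, Dataset

class LazyDataset(Dataset):
    """Preprocess on access instead of upfront, so DataLoader workers
    can parallelize the per-image preprocessing across CPU cores."""
    def __init__(self, paths, labels, preprocess):
        self.paths, self.labels, self.preprocess = paths, labels, preprocess

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        return self.preprocess(self.paths[idx]), self.labels[idx]

def make_loader(paths, labels, preprocess, workers=4):
    # num_workers spawns worker processes that each call __getitem__.
    return DataLoader(LazyDataset(paths, labels, preprocess),
                      batch_size=32, num_workers=workers, shuffle=True)
```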
Analysis of dataset image properties; scripts written for data loading and data preprocessing.