btgraham / SparseConvNet-archived

Spatially-sparse convolutional networks. Allows processing of sparse 2, 3 and 4 dimensional data. Build CNNs on the square/cubic/hypercubic or triangular/tetrahedral/hyper-tetrahedral lattices.
https://github.com/btgraham/SparseConvNet/wiki

How about using linear regression? #8

Closed tengshaofeng closed 7 years ago

tengshaofeng commented 7 years ago

Hi, btgraham. I really appreciate your great work, but I am a little confused by the following sentence in your paper: "Using linear regression on just the CNN generated probability distributions, without any of the meta-data, also seems to work well."

Is the linear regression done with the probability distributions as features? For example, if you have 3 nets, repeat the random transformation 3 times, and use both the left and right eye, do you have 3*3*2*5=90 feature dimensions for the linear regression?

btgraham commented 7 years ago

Re: the Diabetic Retinopathy competition. The 5 probabilities produced by softmax add to 1, so there are only 4 degrees of freedom (drop one of the outputs). If you have multiple networks, you can just average them. Using both the eye being tested, and the other eye, that gives 4+4=8 features. You can then use linear regression to approximate the class.

tengshaofeng commented 7 years ago

For the left eye, for example, say the averaged output of three nets is (0.1, 0.1, 0.1, 0.1, 0.6); which one do you drop? For the right eye, say it is (0.1, 0.1, 0.1, 0.05, 0.65). So are you saying that the 8 features are (0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.05)? I do not understand.

btgraham commented 7 years ago

Say the probabilities are (L[0],L[1],L[2],L[3],L[4]) for the left eye and (R[0],R[1],R[2],R[3],R[4]) for the right eye. Use (L[1],L[2],L[3],L[4],R[1],R[2],R[3],R[4]) to predict the left eye and (R[1],R[2],R[3],R[4],L[1],L[2],L[3],L[4]) to predict the right eye.
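The feature layout described above can be sketched in a few lines. This is an illustration only, not code from the repository or the paper: the helper names `eye_features` and `fit_linear` are hypothetical, and ordinary least squares via NumPy stands in for whatever regression routine was actually used. Dropping class 0 also keeps the features from summing to 1, which would otherwise duplicate the intercept column.

```python
import numpy as np

def eye_features(own, other):
    """Drop class 0 from each 5-way probability vector; own eye comes first."""
    own = np.asarray(own, dtype=float)
    other = np.asarray(other, dtype=float)
    return np.concatenate([own[1:], other[1:]])

def fit_linear(X, y):
    """Ordinary least squares with an intercept term; returns (weights, bias)."""
    X = np.asarray(X, dtype=float)
    A = np.hstack([X, np.ones((X.shape[0], 1))])  # append intercept column
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef[:-1], coef[-1]

# Averaged softmax outputs taken from the question earlier in the thread.
L = [0.1, 0.1, 0.1, 0.1, 0.6]
R = [0.1, 0.1, 0.1, 0.05, 0.65]

left_x = eye_features(L, R)   # (L[1..4], R[1..4]) -> predicts the left eye
right_x = eye_features(R, L)  # (R[1..4], L[1..4]) -> predicts the right eye
```

`fit_linear` would be called on a matrix with one such 8-value row per training eye, with the graded labels as `y`; the fitted weights then give a continuous score per eye.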

tengshaofeng commented 7 years ago

Thanks for your reply. Three questions:
1) Why drop L[0]?
2) Do you fit the linear regression on the features (L[1],L[2],L[3],L[4],R[1],R[2],R[3],R[4]) with the ratings as labels, then use the trained model to map each sample to a float value and set thresholds to decide the rating? For example, values < 0.5 map to 0.
3) Have you tried just averaging the outputs of multiple nets and repeated random tests? What kappa score does that give?

tengshaofeng commented 7 years ago

And the fourth question: after you preprocess the image, the values are around 128. [image]

Is normalization needed, like (x-mean)/std or x/255.0?

Thanks so much.

btgraham commented 7 years ago

1) L[0] adds no extra information (it is linearly dependent on the other four).
2) Use cross validation to decide the thresholds.
3) Similar to using forests.
4) Only the corners should look like that. The center should be more interesting.
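For point 2, the step from a continuous regression output to a class can be sketched as below. This shows only the mapping, under assumptions: `to_classes` is a hypothetical helper, and the threshold values are illustrative midpoints between the five classes, not tuned values from the competition. In practice they would be chosen on held-out folds.

```python
import numpy as np

def to_classes(scores, thresholds):
    """Map continuous regression outputs to integer classes 0..len(thresholds)."""
    # np.digitize returns, for each score, how many thresholds it has reached.
    return np.digitize(np.asarray(scores, dtype=float), np.asarray(thresholds))

# Hypothetical starting thresholds; tune them with cross validation.
thresholds = [0.5, 1.5, 2.5, 3.5]
classes = to_classes([0.2, 0.9, 2.6, 4.3], thresholds)
```

A score exactly equal to a threshold falls into the higher class, so "< 0.5 maps to 0" holds with these values.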

tengshaofeng commented 7 years ago

Thank you very much for your help; it really helped me. On the fourth question, though, I am not confused about the values of 128 around the corners. I am asking whether I should apply normalization such as (img-mean)/std before feeding the image to the CNN.

btgraham commented 7 years ago

I think I scaled to [-1, +1], but other scalings could work.
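One simple way to get a [-1, +1] range, assuming the preprocessed images hold 8-bit values in [0, 255], is a linear rescale. The function name `scale_to_unit_range` is hypothetical and this is a sketch rather than the original preprocessing code.

```python
import numpy as np

def scale_to_unit_range(image):
    """Linearly map 8-bit pixel values from [0, 255] to [-1.0, +1.0]."""
    return np.asarray(image, dtype=np.float32) / 127.5 - 1.0
```

With this mapping, background pixels near 128 land close to 0.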

tengshaofeng commented 7 years ago

OK, thank you very much. So kind of you.

tengshaofeng commented 7 years ago

@btgraham Sorry to bother you again. I am reading your paper. [image] I know the radius is the eye's, but I do not know the exact size of the input image. Is it 540*540 if the radius is 270?

btgraham commented 7 years ago

Input size should be >= 540x540. Larger sizes were used to allow data augmentation (translations).
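Translation augmentation of this kind is often implemented as a random crop from a slightly larger image. The sketch below assumes that approach; `random_crop` is a hypothetical helper, and the 600x600 source size is only an illustrative value, since the thread states just that the input should be at least 540x540.

```python
import numpy as np

def random_crop(image, size, rng):
    """Return a random size x size window from an (H, W, ...) array."""
    h, w = image.shape[:2]
    if h < size or w < size:
        raise ValueError("image is smaller than the crop size")
    top = rng.integers(0, h - size + 1)    # upper bound is exclusive
    left = rng.integers(0, w - size + 1)
    return image[top:top + size, left:left + size]

rng = np.random.default_rng(0)
padded = np.zeros((600, 600, 3), dtype=np.float32)  # hypothetical larger image
crop = random_crop(padded, 540, rng)
```

Each call picks a new offset, so the same eye image is seen at slightly different positions across epochs.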

tengshaofeng commented 7 years ago

For example, a 540*540 image is randomly cropped from a 600*600 image, and the 540*540 crop is then used as the input?