pystruct / pyqpbo

QPBO interface and alpha expansion for Python

[question] Why int? #3

kondra closed this issue 10 years ago

kondra commented 10 years ago

Why do you use integers? Why not doubles?

In pystruct there is code like `(1000*unary).toint`; I think this may decrease the quality of inference.
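
For context, the conversion referred to is along these lines (a rough sketch, not the exact pystruct code; the scale factor and dtype here are assumptions):

```python
import numpy as np

# QPBO operates on integer energies, so float unaries are scaled up and
# truncated; any structure finer than the scale's resolution is lost.
unary = np.random.randn(100, 10)             # hypothetical float potentials
unary_int = (1000 * unary).astype(np.int32)
```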

amueller commented 10 years ago

Because Vladimir writes in the documentation that graph cut with integers is more stable. I haven't really tried; actually, I have never run the QPBO code with floats. Have you tried that?

kondra commented 10 years ago

I use doubles in another Boykov & Kolmogorov library for graph cuts, through a Matlab MEX wrapper (the maxflow package from Kolmogorov's web page). I've written code that uses this wrapper (my own Matlab implementation of alpha-expansion is based on it) together with Joachims' SVMstruct through Andrea Vedaldi's Matlab wrapper. With all this I solve a simple structured prediction problem on small grid graphs and get ~15% Hamming loss on the test set. I tried to rewrite all of it in pystruct and get ~25% Hamming loss on the test set. I also tried to modify the pyx files to use double and removed the conversion to integers in inference_methods.py (pystruct), but I think I did something wrong, because it does not work at all.
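
(For reference, Hamming loss here is just the fraction of mislabeled nodes; a minimal numpy sketch:)

```python
import numpy as np

def hamming_loss(y_true, y_pred):
    # fraction of grid nodes whose predicted label differs from ground truth
    return np.mean(np.asarray(y_true) != np.asarray(y_pred))
```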

amueller commented 10 years ago

Well, you would also have to change QPBO itself to use floats. I think the reason for the discrepancy is more likely that C is scaled differently: to convert the svm-struct C to the pystruct C, you have to multiply by n_samples (and possibly by 100, because svm^multi, for example, scales the loss by 100 for no reason that I understand).
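
A worked example of that scaling (all numbers illustrative):

```python
n_samples = 500        # training set size (illustrative)
svmstruct_C = 0.01     # the C used with svm-struct
pystruct_C = svmstruct_C * n_samples  # pystruct's convention
pystruct_C_100 = pystruct_C * 100     # if svm^multi's 100x loss scaling applies
```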

amueller commented 10 years ago

Btw, you could also try the ad3 solver, which will take longer but should give you much better accuracy. What kind of model are you using, and which solver?

kondra commented 10 years ago

QPBO is templated, so I think there is nothing to change there.

I use a modified EdgeFeatureCRF. I have 10 labels; the unary potentials differ between labels, but the pairwise potentials are shared, the same as in my Matlab code. I use the one-slack solver.
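
A minimal sketch of such a setup (the class shipped with pystruct is EdgeFeatureGraphCRF; the hyperparameters and data layout below are assumptions):

```python
from pystruct.models import EdgeFeatureGraphCRF
from pystruct.learners import OneSlackSSVM

# 10 labels with label-dependent unaries; pairwise weights shared across labels.
crf = EdgeFeatureGraphCRF(inference_method='qpbo')
learner = OneSlackSSVM(crf, C=0.1)
# X: list of (node_features, edges, edge_features) triples, Y: list of label arrays
# learner.fit(X, Y)
```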

amueller commented 10 years ago

Have you tried adjusting C? And are you sure the OneSlackSSVM solver converged?

amueller commented 10 years ago

I can have a look at QPBO tomorrow, but I really don't think the problem is there. As I said, you could try ad3 (just `pip install ad3`). If you have trouble reproducing your Matlab results, I'd really like to get to the bottom of this.

kondra commented 10 years ago

I tried adjusting C; with small C (0.1 ... 0.0001) it converges in 150-200 iterations. With C=0.1 the primal objective is 2823 and the gap is -11; I think that's ok.

I'm going to compare inference quality without any learning, to determine whether the problem really is in the inference, and I'll try to make the results clearer and reproducible.
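
One way to set up such a learning-free comparison is through pystruct's inference dispatcher; a sketch with random placeholder potentials on a 5x5 grid:

```python
import numpy as np
from pystruct.inference import inference_dispatch

n_nodes, n_states, w = 25, 10, 5
unaries = np.random.randn(n_nodes, n_states)
pairwise = np.eye(n_states)  # Potts-like placeholder pairwise potentials
edges = np.array([(i, i + 1) for i in range(n_nodes - 1) if (i + 1) % w]
                 + [(i, i + w) for i in range(n_nodes - w)])

# run the same potentials through both solvers and compare the labelings
for method in ['qpbo', 'ad3']:
    y = inference_dispatch(unaries, pairwise, edges, inference_method=method)
    print(method, y)
```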

amueller commented 10 years ago

Hehe, ok, if the gap is -11 then inference doesn't find any more constraints. So changing C does not help improve performance? Thanks a lot for investigating; I am really interested in the outcome. Also, feel free to send me the data if you can ;) What application are you working on?

kondra commented 10 years ago

Yep, it doesn't help. I think I have a bug in my own CRF model implementation. I tried ad3; it is very, very slow, so I Ctrl-C'd it at a gap of 0.0001, but it performed very badly: 45% error. Btw, I think using integers really isn't an issue: I compared my Matlab code for stereo reconstruction (which uses doubles) against the Python code with the same potentials, and the results are almost identical; there was no significant drop in performance.

The project is about latent SSVMs and weakly labeled data (it will be the topic of my master's thesis).

amueller commented 10 years ago

So let's close this, ok? I would love to get an update on how your work is going and whether pystruct is helping, if you keep using it. If you run into any trouble, feel free to write to the mailing list.

Btw, you have seen that there are two latent-variable solvers, right? For weakly labeled data, the SubgradientLatentSSVM might work; it hasn't been used in the literature yet, afaik.
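
For reference, a sketch of pulling in the two latent-variable learners (the arguments are placeholders; check the pystruct docs for the actual signatures):

```python
from pystruct.models import LatentGraphCRF
from pystruct.learners import OneSlackSSVM, LatentSSVM, SubgradientLatentSSVM

model = LatentGraphCRF(n_states_per_label=2)  # hidden sub-states per label
# LatentSSVM alternates imputing the latent variables with a full SSVM fit:
learner = LatentSSVM(OneSlackSSVM(model, C=0.1))
# SubgradientLatentSSVM instead descends on the latent objective directly:
# learner = SubgradientLatentSSVM(model, C=0.1)
```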