justindomke / marbl

Marginal based learning of Conditional Random Field parameters
MIT License

Image segmentation #8

Closed thuanvh closed 8 years ago

thuanvh commented 9 years ago

Hi @justindomke, could you add an example of using marbl for image segmentation? Maybe using the images from JGMT. Regards, Thuan

justindomke commented 9 years ago

Hi Thuan,

I'd like to do that, but it's a ton of work to get something into a good tutorial format, and I'm really busy with other things at the moment, so I definitely can't promise anything in the near future.

If you put together an example, however, I'd be very happy to include it! :)

cheers, Justin


thuanvh commented 9 years ago

Hi Justin, I would be very glad to do it. Now I wonder: how do I represent an image as a graph of feature nodes, and use it for marbl training? Could you help me with that?

justindomke commented 9 years ago

I'm not sure if this is the answer you are looking for, but typically, problems on images involve pairwise graphs. So, you'd want to create a graph with one node for each pixel, and then one region for each pixel and each adjacent pair of pixels.

e.g. on a 3x3 graph you might use the layout

0 3 6
1 4 7
2 5 8

and regions

0
1
2
3
4
5
6
7
8
0 1
1 2
0 3
1 4
2 5
3 4
4 5
3 6
4 7
5 8
6 7
7 8
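
The enumeration above can be sketched in code for an arbitrary H x W grid (a sketch only, not marbl's I/O code; `grid_regions` is a hypothetical helper using the column-major node layout from the 3x3 example):

```python
# Sketch: enumerate nodes and regions for an H x W pixel grid,
# one singleton region per pixel plus one pairwise region per
# vertically/horizontally adjacent pair of pixels.
def grid_regions(H, W):
    def node(r, c):
        # column-major node index, matching the 3x3 layout above
        return c * H + r

    regions = []
    # singleton regions, one per pixel
    for c in range(W):
        for r in range(H):
            regions.append((node(r, c),))
    # pairwise regions for adjacent pixels
    for c in range(W):
        for r in range(H):
            if r + 1 < H:
                regions.append((node(r, c), node(r + 1, c)))  # vertical pair
            if c + 1 < W:
                regions.append((node(r, c), node(r, c + 1)))  # horizontal pair
    return regions

regions = grid_regions(3, 3)
print(regions)
```

For the 3x3 case this produces the same set of regions as listed above (9 singletons and 12 pairs), though the exact ordering of the pairs may differ.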

Hope this helps! Justin


thuanvh commented 9 years ago

That is with one pixel per region; what if I use a superpixel (blob) region for the graph? How do I represent the features of each region as input to Marbl?

For a specific example, I want to use example_backgrounds from JGMT in Marbl. Could I use the features extracted in JGMT for Marbl training?

justindomke commented 9 years ago

You certainly could, yes. At the moment, the best documentation for how to input the features is the examples, e.g.:

https://github.com/justindomke/marbl/blob/master/examples/chain_learning.md


thuanvh commented 9 years ago

Generating model.txt and data.txt is very clear. If I have 8 classes, the first line of model.txt should be: 4 8 8 8 8 (instead of 4 2 2 2 2), and the last line of data.txt should be the label of each node. Is that right?

justindomke commented 9 years ago

Yep, that looks exactly right to me.
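
To make the confirmed format concrete, here is a tiny sketch of emitting that header line (`model_header` is a hypothetical helper, not part of marbl; the format is the one discussed above: the number of variables followed by the number of states per variable):

```python
# Sketch: build the first line of model.txt for n_vars variables,
# each taking n_vals states (e.g. 8 classes instead of binary).
def model_header(n_vars, n_vals):
    return " ".join([str(n_vars)] + [str(n_vals)] * n_vars)

print(model_header(4, 8))  # prints: 4 8 8 8 8
print(model_header(4, 2))  # prints: 4 2 2 2 2
```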


thuanvh commented 9 years ago

Ok, I will try it.


thuanvh commented 9 years ago

Hi @justindomke, I am trying to segment images. With marbl, is it possible to learn a separate model for each training sample? In your examples, I see that example1, 2, and 3 are all the same. In my case, the images do not all have the same size or the same number of superpixel regions. Could marbl learn from that?

justindomke commented 9 years ago

Hi,

Sure, absolutely. You can give different models for each training example. In the examples I don't do this, but there is no problem if you do. The "tying" of parameters is all done through the "types" of the factors, so you just constrain different factors to have the same type when you want them to share parameters. This allows a lot of flexibility in how you attack different problems.
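
A minimal sketch of this tying scheme (illustrative only, not marbl's API; `factor_type` is a hypothetical helper): factors in different per-image models share parameters whenever they are assigned the same type.

```python
# Sketch: tie parameters across differently-shaped per-image models by
# assigning the same type to factors that should share parameters,
# e.g. type 0 for all singleton factors, type 1 for all pairwise ones.
def factor_type(region):
    return 0 if len(region) == 1 else 1

# two images with different graphs still share the same two parameter sets
image_a = [(0,), (1,), (0, 1)]
image_b = [(0,), (1,), (2,), (0, 1), (1, 2)]
print([factor_type(r) for r in image_a])  # [0, 0, 1]
print([factor_type(r) for r in image_b])  # [0, 0, 0, 1, 1]
```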

cheers, Justin


thuanvh commented 9 years ago

Hi, I ran a test on my data. After the first few iterations, the loss value becomes nan:

L-BFGS optimization terminated with status code = -1000
fx = nan, x[0] = -0.997497, x[1] = 0.127171

Which parameter could I change so that the loss is not NaN?

thuanvh commented 9 years ago

Oh, it's my fault. I generated a wrong type (0, 1) for nodes and edges in my code.

justindomke commented 9 years ago

Yes, I was going to say that, usually, nans are a sign that "the user is doing something wrong". However, ideally the code would have better error checking and error messages to give you more guidance.
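
Since out-of-range factor types caused the nan above, one simple safeguard is to validate the data before training (a hypothetical sketch; `check_types` is not part of marbl):

```python
# Sketch: verify every factor type lies in [0, n_declared_types) before
# handing the model to the trainer; an out-of-range type is a common
# cause of nan losses like the one reported above.
def check_types(factor_types, n_declared_types):
    bad = [t for t in factor_types if not (0 <= t < n_declared_types)]
    if bad:
        raise ValueError(f"invalid factor types: {bad}")

check_types([0, 0, 1], 2)   # ok: all types are 0 or 1
# check_types([0, 2], 2)    # would raise ValueError: type 2 is out of range
```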


thuanvh commented 9 years ago

Hi, I tested marbl with my data; the precision obtained is about 0.5. I wonder how the loss value is calculated? In marbl, is the loss the number of labels that differ from the ground truth? I see in CRF.cpp that the loss is calculated as follows:

L = 0;
for (int alpha = 0; alpha < mu.size(); alpha++) {
    if (y_configs(alpha) == -1) continue;  // skip unobserved configurations
    L -= log(mu[alpha](y_configs(alpha))) / gradnorm;
    dlogmu[alpha].setZero();
    dlogmu[alpha](y_configs(alpha)) = -1 / gradnorm;
    if (L != L) cout << "badmu: " << mu[alpha].transpose() << endl;  // L != L detects nan
}

How could I change mu[alpha](y_configs(alpha)) into a predicted label?

thuanvh commented 9 years ago

Hi @justindomke, the output of CRF inference is marginals. Can we calculate the predicted labels from the marginals? Currently I estimate each label as the maximum of the node probability in the marginal. Could we also use the edge probabilities in the marginals for label estimation?

justindomke commented 9 years ago

Using the node marginals as you are doing is probably the best bet. That's traditionally known as "maximum posterior marginal".


thuanvh commented 9 years ago

Hi Justin, I have read https://github.com/justindomke/marbl/blob/master/examples/chain_inference.md . The following text mentions the order of factors:

Why put the factors in this order? This is because inference in this toolbox proceeds in the order specified here. Namely, one iteration corresponds to, first, updating all factors in the order given, and then updating all factors in the reverse of the order given. For a chain, this means that all messages will have converged in a single iteration

Is this order only for a simple chain (1-2-3-4)? For a more complex graph like

1 2 3
4 5 6
7 8 9

does the order still matter? That is, if I change the order of nodes and edges in data.txt, will the result change?

Thank you,

justindomke commented 9 years ago

Yes, the order matters no matter what graph you use. If you use a large number of message-passing iterations, the messages will probably converge to the same thing, regardless of the order. (Though this depends a bit on what entropy you use.) With a small number of message-passing iterations, it varies.
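
For the chain case, one plausible way to emit an ordering that lets a single forward-then-reverse sweep propagate information end to end is sketched below (a sketch under assumptions; marbl's chain example may interleave the factors differently):

```python
# Sketch: list a chain's factors left to right so one forward pass,
# followed by the automatic reverse pass, carries messages end to end.
def chain_order(n):
    order = []
    for i in range(n):
        order.append((i,))            # singleton factor for node i
        if i + 1 < n:
            order.append((i, i + 1))  # pairwise factor linking i and i+1
    return order

print(chain_order(4))  # [(0,), (0, 1), (1,), (1, 2), (2,), (2, 3), (3,)]
```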


eswears commented 8 years ago

Justin,

Thanks for referring me to your MARBL toolbox; it does seem more general and user-friendly than JGMT.

I read through the examples and you use Bethe entropy to set the entropies for pairwise graphs. Can this also be used for graphs that are not pairwise? If so, how would one go about calculating it for larger cliques? Also, what exactly is the entropy trying to capture here?

Thanks, Eran

justindomke commented 8 years ago

Hi Eran,

The Bethe entropy can certainly be used for non-pairwise graphs. It's basically the same formula: an entropy of one for all non-singleton factors, and an entropy of 1-degree for all singleton factors.

If you want to be even fancier, you can go to "generalized belief propagation" which is a class of algorithms that have larger intersections between factors than just singletons. Marbl can handle many of these entropies as well.

eswears commented 8 years ago

Thanks Justin. How is "degree" defined? In the chain_inference.md example, "degree" is the number of pairs that are "touching". Is this the number of cliques that a single node is shared across?

-Eran

justindomke commented 8 years ago

Yes, exactly: The degree of a node is the number of (non-singleton) factors in which it is contained. (More formally, I guess we could speak of a hypergraph and hyperedges.)
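
Putting the two answers above together, the Bethe counting numbers can be sketched as follows (a sketch only; `bethe_weights` is a hypothetical helper, not marbl's API): weight 1 for every non-singleton factor, and 1 - degree for every singleton, where a node's degree is the number of non-singleton factors containing it.

```python
# Sketch of the Bethe entropy weights described above.
def bethe_weights(regions):
    # degree of a node = number of non-singleton factors containing it
    degree = {}
    for r in regions:
        if len(r) > 1:
            for v in r:
                degree[v] = degree.get(v, 0) + 1
    weights = []
    for r in regions:
        if len(r) == 1:
            weights.append(1 - degree.get(r[0], 0))  # 1 - degree for singletons
        else:
            weights.append(1)                        # 1 for larger factors
    return weights

# chain 0-1-2: three singletons plus two pairwise factors
regions = [(0,), (1,), (2,), (0, 1), (1, 2)]
print(bethe_weights(regions))  # [0, -1, 0, 1, 1]
```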