laxris / flexfringe-colab

Colab Notebook for flexfringe

Alergia algo #2

Closed pberko closed 3 years ago

pberko commented 3 years ago

Hello @laxris

Hope you can help me

I have a PFA that is a "black box", so my data is a list of paths: if the alphabet is "a, b", my data is a file with a list of paths "aaba..., ..., ..." generated from the black box.

I understand that I can infer the PFA using the Alergia algorithm. Is it possible to do this with your tool in Python?

Thanks

laxris commented 3 years ago

Hi @pberko,

Take a look at this tutorial: https://github.com/tudelft-cda-lab/Tutorials/blob/main/FlexFringe/probabilistic_automata.md You need to convert your data into the right input format, which is a text file. The first line has the number of words and the number of distinct letters in your alphabet. Each following line holds one of your input words as follows: label, length, and then each letter, space-separated. Set the label to 0; the length is the number of letters in the word.

Then you run the command from the tutorial to call Alergia.
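As a rough illustration of that conversion, here is a minimal Python sketch. It assumes each path is a plain string whose characters are the letters (e.g. "aaba"); the function name `to_flexfringe_input` is made up for this example and is not part of flexfringe.

```python
def to_flexfringe_input(words, label=0):
    """Render words in the layout described above.

    First line: number of words and number of distinct letters.
    Each following line: label, length, then the letters, space-separated.
    Assumption: every character of a word is one letter of the alphabet.
    """
    alphabet = {letter for word in words for letter in word}
    lines = [f"{len(words)} {len(alphabet)}"]
    for word in words:
        # list(word) splits the string into its single-character letters
        lines.append(" ".join([str(label), str(len(word)), *list(word)]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    paths = ["aaba", "ab", "b"]
    print(to_flexfringe_input(paths), end="")
```

Writing the returned string to a file (for example `traces2.dat`) gives something you can pass to the command from the tutorial.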

Cheers

pberko commented 3 years ago

@laxris thank you

I succeeded in running the command ./flexfringe --heuristic-name overlap_driven --data-name overlap_data traces2.dat

for the file https://github.com/pberko/detano/blob/master/traces2.dat

but the result looks incorrect: the original PFA is https://github.com/pberko/detano/blob/master/blackbox111.pdf

but the learned is https://github.com/pberko/detano/blob/master/outfile.png

Maybe the label should not be set always to "0"?

What exactly is the purpose of the label? Thank you

laxris commented 3 years ago

The default settings for some parameters are throwing away too much of the data. Can you check which branch and version you're currently using?

If I run ./flexfringe --ini=ini/batch-alergia.ini ~/Downloads/traces2.dat.txt with the following settings in the ini file:

[default]
heuristic-name = alergia
data-name = alergia_data
sinkson = 0
printwhite = 1
printblue = 1
lowerbound=0

it looks pretty alright. Please note that the counts on the transitions are absolute counts and should be normalized to get probabilities.

laxris commented 3 years ago

And regarding the label: the label is necessary when learning non-probabilistic state machines (just DFAs, not PDFAs). In that case, learning requires counter-examples (i.e. traces that are not part of the language of the DFA); otherwise learning isn't possible. You label those traces 0 (counter-example) and the others 1 (example). For probabilistic machines, counter-examples are not required, but the input format still expects a label. It's an artifact of the input format, not a requirement for learning.

pberko commented 3 years ago

Thanks @laxris, I just downloaded the master branch from https://bitbucket.org/chrshmmmr/dfasat/src/master/

and for some reason when trying to run with an ini file I get this error: [image]

and also when using your ini file directly: [image]

laxris commented 3 years ago

Ah yes, please don't use master, use the multivariate branch at https://bitbucket.org/chrshmmmr/dfasat/src/multivariate/

pberko commented 3 years ago

@laxris Great! [image]

How can I get the probabilities of the edges?

laxris commented 3 years ago

The alergia.ini file has some settings you'd want to turn off in your case, like markovian. To get the probabilities, you just need to normalize: divide the count on the transition by the total sum of counts from all outgoing transitions of the state.
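A minimal Python sketch of that normalization, in case it helps. It assumes you have already extracted the transitions as (source, symbol, target, count) tuples from the output; that tuple layout and the name `normalize_counts` are assumptions for this example, not flexfringe's own output format.

```python
from collections import defaultdict


def normalize_counts(transitions):
    """Turn absolute transition counts into per-state probabilities.

    transitions: iterable of (source, symbol, target, count) tuples
    (a hypothetical layout chosen for this sketch).
    Each count is divided by the sum of counts over all outgoing
    transitions of the same source state.
    """
    transitions = list(transitions)  # the data is traversed twice
    totals = defaultdict(int)
    for source, _symbol, _target, count in transitions:
        totals[source] += count
    return [
        (source, symbol, target, count / totals[source])
        for source, symbol, target, count in transitions
        if totals[source] > 0  # skip states with no observed outgoing counts
    ]


if __name__ == "__main__":
    edges = [(0, "a", 1, 30), (0, "b", 2, 10), (1, "a", 1, 5)]
    for edge in normalize_counts(edges):
        print(edge)
```

With this, the probabilities on the outgoing transitions of each state sum to 1.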

pberko commented 3 years ago

Thanks a lot.

Now I see that you mentioned the normalization earlier. Thanks!

laxris commented 3 years ago

You can also take a look at the JSON output files; they're a bit easier to work with than the dot files.

Let me know if you run into any other issues.