biolab / orange3

🍊 Orange: Interactive data analysis
https://orangedatamining.com

Bayes net structural learning #4972

Closed GottM closed 4 years ago

GottM commented 4 years ago

Hello, this post is not about an issue but about a proposed enhancement.

I would like to suggest a new widget for Bayesian network structure learning on (at least) categorical input data, along the lines of what the R packages [bnlearn] (learning) and [networkD3] (visualization) offer. The main outputs of the new widget would be a "force network" showing a learned model of how the input features relate to one another (with undirected edges, hence no causal preconception) and the log-likelihood as a measure of the quality of the Bayesian model. The user would choose one of several structure-learning algorithms, in particular the very efficient Tabu search and Chow-Liu.

This widget would give a quick overview and understanding of wide datasets (many features), possibly helping with further causal hypothesis modeling and enabling "dimensionality" reduction when building a predictive model for a given target feature.
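To make the Chow-Liu option concrete, here is a minimal sketch in plain Python, not taken from bnlearn or from Orange. It assumes the data arrive as a dict mapping feature names to equally long lists of categorical values; the names `chow_liu_tree`, `mutual_information` and `entropy` are hypothetical. It builds a maximum spanning tree over pairwise empirical mutual information (undirected edges) and reports the log-likelihood of the fitted tree, using the standard identity N × (sum of edge mutual informations − sum of per-feature entropies).

```python
import math
from collections import Counter
from itertools import combinations


def entropy(column):
    """Empirical entropy (in nats) of one categorical column."""
    n = len(column)
    return -sum((c / n) * math.log(c / n) for c in Counter(column).values())


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two categorical columns."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def chow_liu_tree(columns):
    """Return (edges, log_likelihood) for a dict of name -> list of values.

    Edges form a maximum spanning tree over pairwise mutual information,
    found with Kruskal's algorithm and a small union-find.
    """
    names = list(columns)
    n_rows = len(columns[names[0]])
    weighted = sorted(
        ((mutual_information(columns[a], columns[b]), a, b)
         for a, b in combinations(names, 2)),
        key=lambda t: t[0],
        reverse=True,
    )
    parent = {name: name for name in names}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    edges = []
    for mi, a, b in weighted:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # adding this edge does not create a cycle
            parent[root_a] = root_b
            edges.append((a, b, mi))

    # Log-likelihood of the tree at maximum-likelihood parameters:
    # N * (sum of edge MI - sum of per-feature entropies).
    log_lik = n_rows * (
        sum(mi for _, _, mi in edges)
        - sum(entropy(columns[name]) for name in names)
    )
    return edges, log_lik


# Toy data (made up for illustration): "b" copies "a"; "c" is independent of both.
data = {
    "a": ["x", "x", "y", "y", "x", "y", "x", "y"],
    "b": ["x", "x", "y", "y", "x", "y", "x", "y"],
    "c": ["p", "q", "p", "q", "p", "p", "q", "q"],
}
edges, log_lik = chow_liu_tree(data)
```

On this toy input the strongest edge links `a` and `b`, and `c` is attached with zero mutual information. A real widget would also need to handle missing values and continuous features, which this sketch ignores.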

http://www.bnlearn.com/ http://christophergandrud.github.io/networkD3/ https://paulgovan.shinyapps.io/BayesianNetwork/

Best regards, G.

janezd commented 4 years ago

We have discussed this at today's meeting. This belongs to the would-be-nice-to-have category, but it is quite a project and nobody in the core group is currently interested or available. Would you consider doing it?

For the time being, I'm moving this to #4090, where we collect such ideas.

GottM commented 4 years ago

Hello Janez, as you can read below, Paul Govan shares our enthusiasm and would be willing to join forces with Biolab to extend Orange in the field of Bayes nets, notably Chow-Liu trees.

However, I think he mainly uses R and would need help translating his code into Python, if Python is the only coding option for Orange.

Do you think we can find a way forward to bring Paul in, maybe with limited support from some Python experts on your team?

Best regards, Gottfried

On 13 Sept. 2020, at 00:54, Paul Govan pgovan1@aggienetwork.com wrote:

Hi Gottfried,

That sounds like a good idea, but it looks like biolab is written in Python, and I don't have any experience with it. Unless biolab can accept widgets written in different languages or someone is available to translate it, I don't think I can be of much help.

Thanks, Paul

On Fri, Sep 11, 2020, 4:26 PM Gottfried Mathurin gottfried.mathurin@gmail.com wrote:

Hello Paul, as you can read in the enclosed message, Orange Biolab has discussed the idea of creating a new widget for structural learning based on Chow-Liu trees or Tabu search. I had already suggested that they visit your site and test your online software, because I liked it so much that it inspired my recommendation for a new Orange widget.

Now they are asking whether I'd like to develop it myself, but that's definitely beyond my skills, even though I'd love to.

What about you? Would you catch that ball? It would be so great!

Best regards, Gottfried

Begin forwarded message:

From: Janez Demšar notifications@github.com
Date: 11 September 2020 at 10:51:34 UTC+2
To: biolab/orange3 orange3@noreply.github.com
Cc: GottM gottfried.mathurin@gmail.com, Author author@noreply.github.com
Subject: Re: [biolab/orange3] Bayes net structural learning (#4972)
Reply-To: biolab/orange3 reply@reply.github.com

 We have discussed this at today's meeting. This belongs to the would-be-nice-to-have category, but it is quite a project and nobody in the core group is currently interested or available. Would you consider doing it?

For the time being, I'm moving this to #4090, where we collect such ideas.


janezd commented 4 years ago

Including code in R is not an option because it would require the user to have R installed. Plus, there would be an additional dependency, the module for interaction between R and Python; such dependencies can be a nightmare to maintain. Finally, transferring data back and forth is not very efficient. Having R code would only be possible if this were an add-on, but even then I would warn about maintenance problems.

If there is (or will be) reasonable-quality Bayesian network code in Python, and if the widget is simple, that is, without any visualization (note that Orange has an add-on that includes network visualization), just various radio buttons and combo boxes, we can help. But implementing (or translating) Bayesian networks in Python, or implementing complex visualization, would require too much effort.