amb-enthusiast opened 10 years ago
Hi Ant,
Banish currently uses its own data format. At the startup meeting we were asked to consider whether a MongoDB-compatible data format could be used (which I'll have to read up on, as I have no prior experience of it). There are currently no requirements for Banish to handle other data formats.
We do however want Banish to be used and to be as useful as possible, so we can certainly add your requirements to a list of potential upgrades for the future.
Paul
Hi Paul,
MongoDB offers a document-oriented store, very closely aligned with JSON format, with indexes operating on document properties to enable rapid search/querying. It is a really exciting technology, and I've been using it on a few projects.
One option would be to store a BN as a MongoDB document, in a form that closely resembles the .net format; something like:
{
    modelAuthors : [ "Ant" , "Paul" ] ,
    modelTitle : "MyFirstModel" ,
    modelLastEdited : "2014-03-01T13:32:01GMT" ,
    modelDescription : "A toy example" ,
    modelMetadata : {
        accessControls : [ "group1" , "group2" ] ,
        modelSummaryStats : {
            totalNodes : 2 ,
            meanInDegree : 0.5
        }
    } ,
    nodes : [
        { name : "A" , values : [ "a0" , "a1" ] } ,
        { name : "B" , values : [ "b0" , "b1" , "b2" ] }
    ] ,
    potentials : [
        {
            name : "P_A" ,
            nodes : [ "A" ] ,
            values : [ 0.33 , 0.67 ]
        } ,
        {
            name : "P_B" ,
            nodes : [ "B" ] ,
            values : [ 0.55 , 0.3 , 0.15 ]
        } ,
        {
            name : "P_A | B" ,
            nodes : [ "A" , "B" ] ,
            values : [ 0.333 , 0.667 , 0.25 , 0.75 , 0.55 , 0.45 ]
        }
    ]
}
I doubt that this is optimal, but it could be a quick way to support easy conversion between Banish format and widely used existing formats.
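To make the shape of that document a bit more concrete, here is a rough Python sketch of a sanity check that could run before a model is stored or converted. It is only an illustration: the function name `check_potentials` is made up, and it assumes that the first entry in a potential's `nodes` list is the child variable and that its states vary fastest in `values` (which is how the toy numbers above happen to be laid out). Banish itself may well do this differently.

```python
import json
from math import isclose, prod

# The toy model from above, written as plain JSON-compatible Python data.
model = {
    "modelTitle": "MyFirstModel",
    "nodes": [
        {"name": "A", "values": ["a0", "a1"]},
        {"name": "B", "values": ["b0", "b1", "b2"]},
    ],
    "potentials": [
        {"name": "P_A", "nodes": ["A"], "values": [0.33, 0.67]},
        {"name": "P_B", "nodes": ["B"], "values": [0.55, 0.3, 0.15]},
        {"name": "P_A | B", "nodes": ["A", "B"],
         "values": [0.333, 0.667, 0.25, 0.75, 0.55, 0.45]},
    ],
}


def check_potentials(doc, tol=1e-6):
    """Return a list of problems found; an empty list means all checks passed.

    Assumption (not part of any Banish spec): the first node listed in a
    potential is the child, and its states vary fastest in `values`.
    """
    sizes = {node["name"]: len(node["values"]) for node in doc["nodes"]}
    problems = []
    for pot in doc["potentials"]:
        unknown = [name for name in pot["nodes"] if name not in sizes]
        if unknown:
            problems.append(f"{pot['name']}: unknown nodes {unknown}")
            continue
        # One value is needed per joint configuration of the listed nodes.
        expected = prod(sizes[name] for name in pot["nodes"])
        if len(pot["values"]) != expected:
            problems.append(
                f"{pot['name']}: expected {expected} values, "
                f"got {len(pot['values'])}"
            )
            continue
        # Each block of child states should sum to 1 (within tolerance).
        width = sizes[pot["nodes"][0]]
        for start in range(0, expected, width):
            total = sum(pot["values"][start:start + width])
            if not isclose(total, 1.0, abs_tol=tol):
                problems.append(
                    f"{pot['name']}: values {start}..{start + width - 1} "
                    f"sum to {total}"
                )
    return problems


if __name__ == "__main__":
    print(check_potentials(model))
    # The document survives a JSON round trip unchanged.
    print(json.loads(json.dumps(model)) == model)
```

Since the model is plain JSON-compatible data, the same dictionary could be handed to a MongoDB driver or written to a file without further conversion.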
Thanks Ant. That's very helpful. I'll definitely produce something along the lines you suggest.
Glad I could help!
Hi team,
Just wanted to flag up requirements for file IO features in Banish.
Existing efforts
SAMIAM reads and writes several file formats: http://reasoning.cs.ucla.edu/samiam/iframe/fileformats.html
However, it depends on the SMILE library to do this. The Genie/Smile project gives details of support for a range of formats: http://genie.sis.pitt.edu/wiki/Elements_of_GeNIe:_File_formats_suported_in_GeNIe
This somewhat old project looks promising: http://www.kddresearch.org/Groups/Probabilistic-Reasoning/convertor.html
Whereas this looks like another great standards initiative: http://www.cs.cmu.edu/~fgcozman/Research/InterchangeFormat/ Nothing like XML to bloat file size with tag text...
Requirements
I've mostly worked with SAMIAM and Kevin Murphy's BNT to date, but plan on using Genie/Smile. And of course, Banish.
In the case of SAMIAM, I've used the default text-based Hugin file formats (.net and .dat). I have a set of models that I plan to enrich & refine.
The BNT will load BNIF files http://bayesnet.github.io/bnt/docs/usage.html#file
I have a smaller set of models for BNT, but still plan to improve the models over time. BNT is used for its learning algos that go beyond those offered in SAMIAM.
I guess I would like Banish to read and write Hugin, BNIF and, in anticipation of future work, the Genie/Smile DSL.
The BBN conversion tool deals with all these cases, and as such, could be a useful starting point.
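To give a feel for what the Hugin side of this might involve, here is a very rough Python sketch of a writer that turns a small JSON-style model into .net-like text. Treat it as an assumption-laden illustration, not a conforming exporter: the function name `to_hugin_net` and the input layout are hypothetical, the output syntax is reproduced from memory of simple .net files and should be checked against the Hugin NET specification, and only potentials with zero or one parent are handled. It also assumes one potential per node, with the child listed first and its states varying fastest.

```python
# Hypothetical JSON-style model: one potential per node, child listed first.
model = {
    "nodes": [
        {"name": "A", "values": ["a0", "a1"]},
        {"name": "B", "values": ["b0", "b1", "b2"]},
    ],
    "potentials": [
        {"nodes": ["B"], "values": [0.55, 0.3, 0.15]},
        {"nodes": ["A", "B"],
         "values": [0.333, 0.667, 0.25, 0.75, 0.55, 0.45]},
    ],
}


def to_hugin_net(doc):
    """Render `doc` as Hugin .net-style text (sketch; 0 or 1 parent only)."""
    states = {node["name"]: node["values"] for node in doc["nodes"]}
    lines = ["net", "{", "}"]

    # One node block per variable, listing its state labels.
    for name, labels in states.items():
        quoted = " ".join(f'"{label}"' for label in labels)
        lines += [f"node {name}", "{", f"    states = ({quoted});", "}"]

    # One potential block per conditional table.
    for pot in doc["potentials"]:
        child, *parents = pot["nodes"]
        if len(parents) > 1:
            # Deeper nesting of the data table is deliberately left out.
            raise NotImplementedError("only 0 or 1 parent is supported here")
        values = pot["values"]
        if parents:
            width = len(states[child])
            # One parenthesised row of child probabilities per parent state.
            rows = [values[i:i + width] for i in range(0, len(values), width)]
            body = " ".join(
                "(" + " ".join(str(v) for v in row) + ")" for row in rows
            )
            header = f"potential ({child} | {parents[0]})"
        else:
            body = " ".join(str(v) for v in values)
            header = f"potential ({child})"
        lines += [header, "{", f"    data = ({body});", "}"]

    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_hugin_net(model))
```

A matching reader, plus equivalents for BNIF and the Genie/Smile format, would follow the same pattern of converting to and from one shared in-memory model.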
Does this fit with current plans for Banish?