istreamdata / orientgo

Go (golang) client for OrientDB
MIT License

API Inspiration from jinzhu/gorm #6

Open yanfali opened 9 years ago

yanfali commented 9 years ago

I've been using a lot of jinzhu/gorm lately to interact with SQL databases and I think it would be a good style for interacting with OrientDB.

e.g. gorm uses

db.Model(&Invoice{}).Find(&invoices)

Model specifies the model, which maps to a table, and invoices receives the results.

For orientdb, maybe something like:

db.Document(&Invoice{}).Where(Invoice{CustomerID: "15"}).Find(&invoices)

Or later with Graph support:

db.Edge(&Customer{}).Find(&customers)
db.Vertex(&Owns{}).Find(&cars).Preload("customer")

I really like their fluent style API, though error handling takes a little bit of getting used to, where you have to check an Error member after each operation you want to return errors from.

quux00 commented 9 years ago

Thanks yanfali,

I'm not familiar with gorm, so I really appreciate new ideas. I want to do a fairly full analysis before deciding. Two other things on my list to look at are the Go Neo4j driver API (https://github.com/jmcvetta/neoism is one, see also links at https://github.com/go-cq/cq) and the MongoDB driver API.

Do you have an interest in doing a comparison of all those, as well as the Java OrientDB client API and maybe making a recommendation?

A few quick questions on the gorm model:

Q1: Would it be better to do:

db.Document("Invoice").Where ...
db.Edge("Customer").Find(&customers)

With the OrientDB binary driver, I have to know the classname, which is a string in OrientDB. I'm not clear how I would get the classname from an empty struct. How does gorm do it?

Q2: What do &invoices and &customers refer to, as in this example:

db.Edge("Customer").Find(&customers)

Is it a slice of Customer structs?

I agree about liking a fluent style, but that can be tricky in Go if you ever need to return errors or multiple values, so understanding the error handling model of the various APIs would also be an important part of the decision.

Any work you want to do on this is welcome. I am now diving into the serialization side of the low level "native" API. That will probably take me a few weeks, since I only have a few hours a day to work on this. After that, I'll be ready to think about more user-friendly APIs and start applying them.

yanfali commented 9 years ago

Hi quux00,

Thanks for taking the time to respond. I will take a look at the projects you mentioned. I haven't used a graph database in anger yet, so I'm not sure I'm qualified to assess what would constitute a good graph API, but I will definitely take a look and see what I can fathom.

I have used MongoDB from Go and it's fairly straightforward, but since it inherently doesn't support relations or transactions it doesn't map as well onto those concepts. If you build an application that needs relations, you are responsible for them in application code, which explains my interest in OrientDB.

I did try to look at the OrientDB REST API, but couldn't quite understand how it would handle transactions; it seems to support them only through the batch API. Parsing the returned JSON also seemed like a chore. I will take a look at the Java bindings, but one thought I had is that the Node.js/JavaScript bindings might also be a good source of inspiration, since they are supposed to be fairly complete and are fluent style. Fluent seems to work better in a language that supports exceptions and try/catch.

Q1 answer: gorm uses reflection and then normalizes the struct name into the "table" or class name, e.g. HashFile -> hash_files (snake case, pluralized by default). It also supports an optional method on the struct called TableName(), which returns the table name as a string.

e.g.

type Invoice struct {
    ID int64 `gorm:"primary_key"`
    CustomerID int64 `sql:"not null"`
}

func (i *Invoice) TableName() string {
  return "invoices"
}
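As a rough illustration of how a driver could get a class name from an empty struct, here is a minimal sketch that combines both approaches described above: an explicit method takes priority, and reflection is the fallback. The `classNamer` interface, `ClassName()`, `classNameOf`, and `snakeCase` are hypothetical names invented for this example, not part of gorm or orientgo, and the sketch snake-cases the name without attempting pluralization.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
	"unicode"
)

// classNamer is a hypothetical interface, modeled on gorm's TableName()
// convention: a type can implement it to state its class name explicitly.
type classNamer interface {
	ClassName() string
}

// snakeCase converts "HashFile" to "hash_file". It is a simplified helper
// that does not special-case acronyms such as "ID".
func snakeCase(s string) string {
	var b strings.Builder
	for i, r := range s {
		if unicode.IsUpper(r) {
			if i > 0 {
				b.WriteByte('_')
			}
			r = unicode.ToLower(r)
		}
		b.WriteRune(r)
	}
	return b.String()
}

// classNameOf returns the explicit name if the value provides one,
// otherwise a name derived from the struct type via reflection.
// It expects a struct or a pointer to a struct.
func classNameOf(v interface{}) string {
	if n, ok := v.(classNamer); ok {
		return n.ClassName()
	}
	t := reflect.TypeOf(v)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	return snakeCase(t.Name())
}

type HashFile struct{}

type Invoice struct{}

// ClassName overrides the derived name for Invoice.
func (i *Invoice) ClassName() string { return "Invoice" }

func main() {
	fmt.Println(classNameOf(&HashFile{})) // derived by reflection
	fmt.Println(classNameOf(&Invoice{}))  // explicit override
}
```

Because `ClassName` has a pointer receiver here, only `&Invoice{}` picks up the override; passing `Invoice{}` by value would fall back to the reflected name.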

Q2 answer:

Yes, &invoices would be a pointer to a []Invoice; that's how gorm knows how to map your table to a struct. It uses the struct tag "gorm" to specify things it needs from the struct fields, and also supports a "sql" tag for passing values through to the driver.

I kind of like the use of a struct instead of a string because you get compiler support. You could use string const values, which would definitely be simpler and faster than reflection, but possibly less convenient and harder to enforce programmatically. Since OrientDB lets you download the schema, you could do a check at runtime, so it might not be too bad.

On Fluent style in gorm:

Yes, I agree, the fluent style breaks down for errors and multiple values. Some operations trigger errors, some don't. Gorm deals with it by having an explicit Error value which can be checked after every "error prone" operation, e.g.

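func GetInvoices(id int64) ([]Invoice, error) {
    var invoices []Invoice
    // Filter invoices by customer; gorm records any failure in the Error field.
    if err := DB.Where(&Invoice{CustomerID: id}).Find(&invoices).Error; err != nil {
        return nil, err
    }
    return invoices, nil
}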
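To make the "check an Error member" idea concrete, here is a minimal sketch of a fluent builder that records the first error and turns every later step into a no-op. The names `Query`, `Document`, `Where`, `SQL`, and `Err` are all hypothetical, chosen for this illustration; they are not gorm's or orientgo's API, and a real driver would execute the query rather than render text.

```go
package main

import (
	"errors"
	"fmt"
)

// Query is a hypothetical fluent builder. It records the first error seen
// and skips all later steps once an error is set.
type Query struct {
	class string
	where []string
	Err   error
}

// Document starts a chain for the given class name.
func Document(class string) *Query {
	q := &Query{class: class}
	if class == "" {
		q.Err = errors.New("empty class name")
	}
	return q
}

// Where adds a condition unless an earlier step already failed.
func (q *Query) Where(cond string) *Query {
	if q.Err != nil {
		return q
	}
	if cond == "" {
		q.Err = errors.New("empty condition")
		return q
	}
	q.where = append(q.where, cond)
	return q
}

// SQL renders the query text, or returns the recorded error.
func (q *Query) SQL() (string, error) {
	if q.Err != nil {
		return "", q.Err
	}
	s := "SELECT FROM " + q.class
	for i, c := range q.where {
		if i == 0 {
			s += " WHERE " + c
		} else {
			s += " AND " + c
		}
	}
	return s, nil
}

func main() {
	sql, err := Document("Invoice").Where("CustomerID = 15").SQL()
	fmt.Println(sql, err)

	// The error from the first step survives the rest of the chain.
	_, err = Document("").Where("x = 1").SQL()
	fmt.Println(err)
}
```

The trade-off is that the caller must remember to look at the error at the end of the chain; nothing in the type system forces the check.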

Explicit transactions work in a similar way. By default, each update/create/delete starts its own transaction if no transaction is open on the current handle:

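  // Runs inside a function that returns error; customer is an existing record to update.
  tx := DB.Begin()
  if err := tx.Create(&newInvoice).Error; err != nil {
    tx.Rollback()
    return err // stop here instead of continuing on a rolled-back transaction
  }
  if err := tx.Save(&customer).Error; err != nil {
    tx.Rollback()
    return err
  }
  return tx.Commit().Error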
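One way to keep the rollback-or-commit decision in a single place is a small helper that runs a function inside a transaction. This is only a sketch: the `Tx` interface, `InTx`, `fakeTx`, and `errCreate` are hypothetical names for illustration and do not describe gorm's or orientgo's API.

```go
package main

import (
	"errors"
	"fmt"
)

// Tx is a hypothetical transaction interface.
type Tx interface {
	Commit() error
	Rollback() error
}

// InTx runs fn and commits only if it returns nil. Otherwise it rolls back
// and returns fn's error, so no later step runs on a failed transaction.
func InTx(tx Tx, fn func(Tx) error) error {
	if err := fn(tx); err != nil {
		tx.Rollback() // best effort; the original error is what the caller needs
		return err
	}
	return tx.Commit()
}

// fakeTx records which terminal call happened, for demonstration only.
type fakeTx struct{ committed, rolledBack bool }

func (t *fakeTx) Commit() error   { t.committed = true; return nil }
func (t *fakeTx) Rollback() error { t.rolledBack = true; return nil }

// errCreate simulates a failed create operation.
var errCreate = errors.New("create failed")

func main() {
	ok := &fakeTx{}
	fmt.Println(InTx(ok, func(Tx) error { return nil }), ok.committed)

	bad := &fakeTx{}
	err := InTx(bad, func(Tx) error { return errCreate })
	fmt.Println(err, bad.rolledBack, bad.committed)
}
```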

quux00 commented 9 years ago

Sorry for the delay. I had to put in extra time on my day job this weekend. I do have a seekable write buffer in place so now I can tackle serialization seriously and get that foundational piece knocked out over the next few weeks.

I don't have a lot to contribute to the API discussion right now. I think you've made a good start. The gorm error handling model works for me.

I do have a bias towards efficiency and less garbage creation (temporary objects on the heap) when it comes to design, so I try to avoid reflection unless it is necessary. So I'm not sold on

db.Edge(&Customer{})
vs
db.Edge("Customer")

though I'd need to profile it to be sure. The TableName() idea would solve that problem, so I'd defer to you if you think that is a nice way to do it from an end-user perspective.

Also, a minor point, but why not:

db.Edge(Customer{})

rather than

db.Edge(&Customer{})

The Go compiler is probably smart enough to optimize this away, but the former explicitly allows the compiler to keep the empty struct on the stack, whereas when I see &XXX{} I worry that it goes onto the heap and creates more work for the GC.
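The stack-versus-heap question can be measured rather than guessed. The sketch below uses `testing.AllocsPerRun` from the standard library to count heap allocations for both call styles. `typeName` is a hypothetical stand-in for a `db.Edge`-style entry point, and the numbers it prints depend on the compiler version, inlining, and escape analysis, so no expected output is shown.

```go
package main

import (
	"fmt"
	"reflect"
	"testing"
)

type Customer struct {
	ID   int64
	Name string
}

// typeName is a hypothetical stand-in for an entry point that accepts
// either Customer{} or &Customer{} and resolves the struct name.
func typeName(v interface{}) string {
	t := reflect.TypeOf(v)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	return t.Name()
}

var sink string // keeps the compiler from discarding the calls

func main() {
	byValue := testing.AllocsPerRun(1000, func() { sink = typeName(Customer{}) })
	byPointer := testing.AllocsPerRun(1000, func() { sink = typeName(&Customer{}) })
	fmt.Printf("allocs per call: value=%v pointer=%v\n", byValue, byPointer)
}
```

Building with `go build -gcflags=-m` prints the compiler's escape analysis decisions, which shows directly whether a given literal is moved to the heap.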

Feel free to add more here as you go along. Once the serialization piece is done the next things I need to focus on are the end user API and transactions, so you working on this now is great timing.

yanfali commented 9 years ago

That's great news about the serialization!

I agree on the garbage collection. Though in the case of the pointer, I think that's just a gorm convention. Usually you pass in objects to be persisted as pointers, as that's what gorm's reflection code is expecting.

It certainly would be possible for a dev to provide "const"-like versions of those objects so they wouldn't need to be created, and I'm actually not sure whether they are stack or heap allocated by the Go compiler if they exist for just the function invocation. I have no preference, and agree memory profiling would be helpful.

I did a short review of neoism and surveyed the orientdb java document API. I want to look at the Node client next.

Also, just for your review, I think looking at Google's Context package would be useful for ideas about how you may want to handle timeouts and errors across process boundaries. It's fairly intrusive, but the ideas are universal.
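For reference, this is roughly what the context pattern looks like for a timeout. The sketch uses the standard library `context` package (the package lived at golang.org/x/net/context before it moved into the standard library in Go 1.7). `runQuery`, `queryWithTimeout`, and `timedOut` are hypothetical helpers; `runQuery` only simulates a driver call with a timer.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// runQuery is a hypothetical stand-in for a driver call. It simulates work
// that takes `work` long and gives up early if ctx is cancelled or expires.
func runQuery(ctx context.Context, work time.Duration) (string, error) {
	select {
	case <-time.After(work):
		return "result", nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// queryWithTimeout bounds a single call with a deadline.
func queryWithTimeout(timeout, work time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // release the timer even when the query finishes early
	return runQuery(ctx, work)
}

// timedOut reports whether err came from an expired deadline.
func timedOut(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	_, err := queryWithTimeout(20*time.Millisecond, time.Second)
	fmt.Println("slow query timed out:", timedOut(err))

	res, err := queryWithTimeout(time.Second, time.Millisecond)
	fmt.Println("fast query:", res, err)
}
```

The "intrusive" part is visible in the signature: every function on the call path has to accept and forward the `ctx` argument.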

MehSha commented 9 years ago

Hi, I'm working on a project based on Node.js (Koa) and OrientDB, and have some thoughts and questions about inspiration and the API. First, we use Koa mostly because Go has no good driver for OrientDB (and we use OrientDB because handling relations in MongoDB is a big pain), so we would very much like this project to succeed.

1- orientjs (the Node.js OrientDB driver) has a very easy API and is a very good source of inspiration. It still lacks many features when dealing with graphs, so we are writing a custom OGM to tackle this. Our source of inspiration for this OGM is http://bulbflow.com/, which is very easy and nice.

I'm not sure how much is possible in Go, but such a mix of document/graph is fantastic.

2- Aren't you going to separate the low-level API and the higher-level layers into different projects, so that anyone who wants to focus on, fork, or contribute to the high-level API need not think about and study the low-level bits?

3- When do you think a mostly usable high-level API will be released (so we can start projects with this driver)? And if we are going to contribute, where should we start? Frankly, I wonder why this is very much a one-man project; Go and OrientDB are both rising, so there should be more than one person interested in such a project (with spare time to contribute!).

Regards

quux00 commented 9 years ago

Hi MehSha,

Great questions and feedback. Apologies for the delay in responding.

I'm still heads down in the serialization details. I found a bug in the handling of the variable int32 and int64 encoding/decoding. I've fixed that now. Still more serialization types to handle for the next while, so I myself won't be thinking about higher level APIs just yet.
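For readers unfamiliar with variable-length integer encoding, the sketch below round-trips signed values through the zigzag varint functions in Go's `encoding/binary` package. Whether this matches OrientDB's wire format byte for byte is an assumption that would need to be checked against the Java driver; `encodeVarint` and `decodeVarint` are hypothetical helper names, not orientgo functions.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeVarint writes v using the zigzag variable-length encoding from
// encoding/binary. Small magnitudes, positive or negative, take few bytes.
func encodeVarint(v int64) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutVarint(buf, v)
	return buf[:n]
}

// decodeVarint reads one value and reports how many bytes it consumed.
func decodeVarint(b []byte) (int64, int, error) {
	v, n := binary.Varint(b)
	if n <= 0 {
		// n == 0: buffer too small; n < 0: value overflows 64 bits.
		return 0, 0, fmt.Errorf("invalid varint (n=%d)", n)
	}
	return v, n, nil
}

func main() {
	for _, v := range []int64{0, -1, 1, 300, -300, 1 << 40} {
		enc := encodeVarint(v)
		dec, n, err := decodeVarint(enc)
		fmt.Printf("%d -> % x -> %d (%d bytes, err=%v)\n", v, enc, dec, n, err)
	}
}
```

Boundary values such as the minimum and maximum int64 are worth including in any test of a hand-written encoder, since sign handling is where this kind of bug tends to hide.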

But I welcome you and others in the community to continue to do so. I will definitely look at orientjs and bulbflow when I get to that part of the project.

Aren't you going to separate the low-level API and the higher-level layers into different projects, so that anyone who wants to focus on, fork, or contribute to the high-level API need not think about and study the low-level bits?

I will consider your proposal on separate projects, but my default preference is to keep it all together. However, your question is definitely in the spirit of my overall goal - I want a function-based low level API on top of which we can build one or more higher level APIs. Perhaps one for Document DBs, one (or more) for Graphs, as the Java side has with the OrientDB Graph API and the various tinkerpop APIs.

Once the low-level API is stable, there is no reason people can't build whatever higher-level API they would like on it.
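The layering described above could look something like the following sketch, where a higher-level document API depends only on a small low-level interface. Every name here (`RawClient`, `Command`, `DocumentAPI`, `FindAll`, `fakeClient`) is hypothetical and chosen for illustration; none of it reflects orientgo's actual design.

```go
package main

import "fmt"

// RawClient is a hypothetical low-level, function-based layer. A real one
// would speak the OrientDB binary protocol.
type RawClient interface {
	Command(sql string) ([]map[string]interface{}, error)
}

// DocumentAPI is one possible higher-level layer built only on RawClient.
type DocumentAPI struct{ raw RawClient }

// NewDocumentAPI wraps any RawClient implementation.
func NewDocumentAPI(raw RawClient) *DocumentAPI { return &DocumentAPI{raw: raw} }

// FindAll fetches every record of a class through the low-level layer.
func (d *DocumentAPI) FindAll(class string) ([]map[string]interface{}, error) {
	if class == "" {
		return nil, fmt.Errorf("class name must not be empty")
	}
	return d.raw.Command("SELECT FROM " + class)
}

// fakeClient stands in for a network connection so the sketch can run.
type fakeClient struct{ lastSQL string }

func (f *fakeClient) Command(sql string) ([]map[string]interface{}, error) {
	f.lastSQL = sql
	return []map[string]interface{}{{"@class": "Invoice", "CustomerID": 15}}, nil
}

func main() {
	raw := &fakeClient{}
	docs, err := NewDocumentAPI(raw).FindAll("Invoice")
	fmt.Println(raw.lastSQL, docs, err)
}
```

Because the higher layer sees only the interface, it can live in the same repository or a separate one, and a graph-oriented API could sit next to it on the same foundation.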

When do you think a mostly usable high-level API will be released (so we can start projects with this driver)?

I don't have a projection for that, as there are too many unknowns. I can say that I hope to have the serialization pieces done within a few weeks. Once that is in place, it will be possible to create records, and maybe update them (I still have to look at that), from Go data structures.

And if we are going to contribute, where should we start?

Good question. Since I have a three day weekend coming up, I will try to organize a list of suggestions of things to do in the near term and post them as issues others could claim.

Frankly, I wonder why this is very much a one-man project; Go and OrientDB are both rising, so there should be more than one person interested in such a project (with spare time to contribute!).

Sounds good!

For now, I'd say:

One thing that could definitely help is to read through other OrientDB driver codebases and see how they solved these issues. I'm mostly using the OrientDB Java driver as my guide, but no doubt the nodejs or python or other drivers have things we can learn from.