luci / gae

A Google AppEngine SDK wrapper
62 stars 16 forks source link

Lack of documentation #56

Open jimbojetlag opened 8 years ago

jimbojetlag commented 8 years ago

This is a great repo, this is what GAE Datastore for Go should have been in the first place.

Unfortunately it has no documentation. Yes, I can read the code and see the helpful comments, but there is a lack of documentation for things like how to use the gae tag.

riannucci commented 8 years ago

I agree the documentation could be better, though there is a fair amount if you look at the godoc.org rendering of the go documentation. Definitely needs sprucing up.

riannucci commented 8 years ago

I think some specific things to do would be:

jimbojetlag commented 8 years ago

Thanks @riannucci, I think I'm a bit lost here: what is the relationship between luci/gae, luci/luci-go and luci/luci-py?

If I'm interested in building a web app with a rest api, authentication, using luci/gae data modeling facilities, do I need to use all three luci projects?

danjacques commented 8 years ago

I think I'm a bit lost here: what is the relationship between luci/gae, luci/luci-go and luci/luci-py?

luci/gae is a largely-independent package that focuses almost entirely on Google AppEngine and Cloud Services support. It does import a few useful support packages from luci/luci-go, but not any of its application code.

None of this requires or uses code fromluci/luci-py.

If I'm interested in building a web app with a rest api, authentication, using luci/gae data modeling facilities, do I need to use all three luci projects?

Only luci/gae and its handful of luci/luci-go common package dependencies.

jimbojetlag commented 8 years ago

@danjacques thanks, when using luci/gae for an app engine classic app, should I simply run the dev server by goapp as usual?

The only example I could find is https://github.com/luci/luci-go/tree/master/examples/appengine/helloworld_standard which seems to use ./gae.py devserver instead of goapp.

danjacques commented 8 years ago

gae.py is our (LUCI project) deployment script, which wraps goapp, gcloud, and/or appcfg.py as appropriate to add some additional accounting that we use. There is definitely no requirement for you to use that script!

luci/gae's AppEngine support is built on top of the standard AppEngine packages, so any AppEngine app built with luci/gae is a completely standard AppEngine app and will work just fine with dev_appserver.py.

One real perk of luci/gae is that you can swap out backends and your code will still work. For example, when testing, you can replace the AppEngine backend with an in-memory datastore-compatible backend, increasing test execution speed by several magnitudes over aetest-style testing. In this case, you would be using the impl/memory backend instead of the impl/prod backend, but this is for tests, not code that you deploy or run with dev_appserver.py.

The documentation could definitely use some improvement, and it looks like your post inspired riannucci@ to do just that. In the meantime, here is a simple hello world for luci/gae that I just threw together:

package init

import (
        "fmt"
        "net/http"
        "strconv"

        "github.com/luci/gae/impl/prod"
        "github.com/luci/gae/service/datastore"

        "golang.org/x/net/context"
)

func init() {
        http.HandleFunc("/", hello)
}

type MyThing struct {
        ID    int64 `gae:"$id"`
        Value string
}

func hello(w http.ResponseWriter, r *http.Request) {
        c := prod.Use(context.Background(), r)
        runHello(c, w)
}

func runHello(c context.Context, w http.ResponseWriter) {
        s := &MyThing{Value: "Hello World"}
        if err := datastore.Get(c).Put(s); err != nil {
            w.WriteHeader(http.StatusInternalServerError)
            return
        }

        w.Header().Add("x-result-id", strconv.FormatInt(s.ID, 10))
        fmt.Fprintln(w, "Wrote Hello World struct, got ID %v!", s.ID)
}

Here is an example of it being tested:

package init

import (
        "net/http"
        "net/http/httptest"
        "testing"

        "github.com/luci/gae/impl/memory"

        "golang.org/x/net/context"
)

func TestRunHello(t *testing.T) {
        c := memory.Use(context.Background())
        var rec httptest.ResponseRecorder
        runHello(c, &rec)

        if rec.Code != http.StatusOK {
            t.Fatalf("unexpected code: %d", rec.Code)
        }
        resultID := rec.Header().Get("x-result-id")
        t.Logf("Got ID: %v", resultID)
        if resultID == "0" {
            t.Fatalf("unexpected zero ID")
        }
}

Notice the different backends, impl/prod and impl/memory, being used in between tests. Nothing is stopping you from installing impl/prod in an aetest-style test either!

Hope this helps.

jimbojetlag commented 8 years ago

Great, thanks for making things clear.

A couple of more questions:

danjacques commented 8 years ago

Is prpc not meant to be restful-friendly? If I'm correct, it seems like everything is handled by http POST, even if I want to GET a resource, it will send a POST request.

That is correct: pRPC is definitely not a REST-ful RPC mechanism. It's an HTTP-centered RPC mechanism meant to parallel the technology behind gRPC, which (as of now) does not work on AppEngine.

It seems like luci-go supports app engine and openid authentications out of the box. Openid is deprecated, and app engine user only supports Google accounts. Does luci-go also work with Google Identity Kit (now Firebase authentications) based on JWT?

Most of the work on luci-go's auth package has been done by @vadimsht ; hopefully he can provide some more info here on the current capabilities and intended direction of the package.

bsr203 commented 8 years ago

Hi Daniel,

I am also just starting to use this package and your response found quite useful. I am mainly going to use luci for in memory testing facility. I have an existing app, and have doubt on the following API use

prod.Use(context.Background(), r)

I typically construct the context like

ctx := appengine.NewContext(r)
nctx, err := appengine.Namespace(ctx, ns)
// use namespaced context for datastore/memcache/.. access

so am I supposed to use it like

prod.Use(nctx, r)

I didn't look further but why need a second context here, if user already passes a valid (at least in production, user should be using a appengine context ) context.

thanks, bsr

bsr203 commented 8 years ago

few more quick qn:

  1. release note says context.Context only needs to be passed once per function body to obtain e.g. a datastore.Interface, instead of once per API call.. I understand second part, that appengine API needs a valid context so we pass them in each. Does the first part mean, use gae.GetDS(c) to get a type which has all the service API (eg) defined. But, I don't see GetDS defined in the source or doc.
  2. How would you set the fixture/pre conditions in the datastore. say, I need to set a user and few entities (say type Group) in the datastore before I can test the update (Group) functionality. Do you just mock the response instead of setting it? how would I mock a datastore response?

thanks again. bsr.

vadimsht commented 8 years ago

It seems like luci-go supports app engine and openid authentications out of the box. Openid is deprecated, and app engine user only supports Google accounts. Does luci-go also work with Google Identity Kit (now Firebase authentications) based on JWT

luci-go uses OpenID Connect protocol, I believe it isn't deprecated.

It shouldn't be hard to add Identity Kit integration (by implementing these two interfaces).

One issue with using luci-go auth stack, is that it is somewhat more complicated than needed for standalone apps, since it uses LUCI Auth service for user groups, IP whitelists and stuff like that. (It makes sense to use central place for this when you have >2 GAE apps that need them).

In particular, a luci-go app would complain that it needs a link to an Auth Service when deployed.

It is relatively simple to write a trivial local implementation of authdb.DB interface that doesn't depend on LUCI Auth Service though. If you are still interested in using luci-go in your project, I can add it.

danjacques commented 8 years ago

I am also just starting to use this package and your response found quite useful. I am mainly going to use luci for in memory testing facility.

Awesome! That's a huge reason why the package exists.

I have an existing app, and have doubt on the following API use...

So I believe my usage in the example is correct. As you can see in the Use source code, it actually calls appengine.NewContext immediately.

As a general rule of thumb, if luci/gae exposes a version of an AppEngine API, you need to use it instead.

For example, when using luci/gae, you must use the luci/gae Info service APIs to manipulate the namespace, rather than the direct AppEngine API. The reason for this is because the luci/gae versions of those services track additional data. For example, the luci/gae memcache and datastore services both use the info service's namespace functions to know what namespace they're currently in. If you set it directly using AppEngine API, the info service will have the wrong answer, and the other luci/gae services will be operating on the wrong namespace.

Your namespace setup would look like:

import (
  "github.com/luci/gae/impl/prod"
  "github.com/luci/gae/service/info"
)

// ...

ctx := prod.Use(context.Background(), r)
nctx, err := info.Namespace(ctx, ns)
// use namespaced context for datastore/memcache/.. access

release note says context.Context only needs to be passed once per function body to obtain e.g. a datastore.Interface, instead of once per API call.. I understand second part, that appengine API needs a valid context so we pass them in each. Does the first part mean, use gae.GetDS(c) to get a type which has all the service API (eg) defined. But, I don't see GetDS defined in the source or doc.

The current API has you supply a Context when you retrieve a Datastore (or other service) interface. After that, the Context that you use is bound in the returned instance, so you don't need to supply additional Context for each API call.

import "github.com/luci/gae/service/datastore"

dsInst := datastore.Get(ctx)

In this case, all of dsInst's methods don't need a Context passed explicitly to them. This is similarly true for other service/... packages.

We actually have an issue open ( #53 ) to un-do this and make it so the Context is passed once for each function call, both for consistency and so cancellation is simpler.

How would you set the fixture/pre conditions in the datastore. say, I need to set a user and few entities (say type Group) in the datastore before I can test the update (Group) functionality. Do you just mock the response instead of setting it? how would I mock a datastore response?

In testing, one sets up preconditions by directly installing them into the testing datastore. You don't need to mock the datastore response because the datastore is actually in the state that you want to begin testing with.

In your example, you would set up the users/groups in the test code, then run the function under test and assert the success via datastore queries. Something like (note this is not runnable, just an idea):

func TestUpdateGroup(t *testing.T) {
  // Set up `impl/memory` testing Context.
  // Install precondition Users.
  userA := &User{...}
  userB := &User{...}
  if err := datastore.Get(ctx).Put(userA, userB); err != nil {
    t.Fatalf(...)
  }

  // Install precondition Group.
  group := &Group{Users: []*User{userA, userB}}
  if err := datastore.Get(ctx).Put(group); err != nil {
    t.Fatalf(...)
  }

  // Run the test.
  testUpdateGroups(ctx, ...)

  // Assert the group was updated.
  datastore.Get(ctx).Get(group)
  if group.LastUpdateTime.Before(now) {
    t.Fatalf("group didn't update")
  }
}

In the LUCI project, we use the goconvey test framework, which makes it relatively easy to set up common precondition states in datastore and then perform a bunch of different mutations/tests on those states. Any other test framework will work, though, so pick whatever works for you.

Hope this helps!

bsr203 commented 8 years ago

Hi Daniel, Thanks you so much for the detailed explanation. I wanted to respond after I try out, but was bit busy. I will be trying to use it tomorrow. thanks again, bsr.

jimbojetlag commented 8 years ago

@vadimsht thanks you for the thorough explanation of auth.

since it uses LUCI Auth service for user groups, IP whitelists and stuff like that...

... It is relatively simple to write a trivial local implementation of authdb.DB interface that doesn't depend on LUCI Auth Service though. If you are still interested in using luci-go in your project, I can add it.

By user groups, do you mean App Engine specific ACL, or general user roles that can be used in any project? If it is the former, then yes, it'd be great if you could add local implementation of authdb.DB so that luci-go auth can be beneficial for outside projects, including ours, too.

vadimsht commented 8 years ago

By user groups, do you mean App Engine specific ACL, or general user roles that can be used in any project?

General user roles.

They way it work now, one can setup an instance of auth_service (as separate GAE app) and tell a luci-go app to use it (by visiting /admin/settings and following the instructions). Auth Service exposes a UI for group management, like this one: u1nxre17wje

Then luci-go GAE apps can use auth.IsGroupMember call to make group membership checks. And many different GAE apps can use the exact same groups. (Which is a must for us, since we run a bunch of them in a "microservices" model).

The problem now, however, is that this link is required for all luci-go apps that use auth system. I was proposing to add a trivial local implementation of authdb.DB (e.g. one that returns false for all IsGroupMember checks) to remove this mandatory dependency on auth_service. It would make IsGroupMember useless, of course, but the rest of the stack would still work.

Adding a fully functional local version of group management UI is much more time consuming task and not a high priority for me (because we do have auth_service already).

vadimsht commented 8 years ago

Oh, also the current implementation of the groups imposes some limits. Roughly, the total size of groups database should be <10 MB (because it is kept fully in-memory to make IsGroupMember fast, it is called very often). It is good enough for infrastructure-style apps (where total number of users of the system is <10000), but probably won't work if groups are used for end-users from the internet.

jimbojetlag commented 8 years ago

@vadimsht Wow that's impressive. As you mentioned it has the downside of being dependent on another GAE app.

I can see some overlapping in your admin dashboard with Google Endpoints v2, as discussed with @riannucci in a luci-go issue. OAuth and IP-whitelist are covered by Endpoints v2. More stuff like rate control are coming, and the dashboard that you implemented is/will effectively be in Google Cloud Console, where you can manage client ids, etc.

@riannucci this is one of the reasons for connecting prpc to Endpoints v2. But we have that REST vs RPC issue. It might be possible to connect prpc to Endpoints v2 without REST though...

jimbojetlag commented 8 years ago

@riannucci Is there a supported way of performing data validation when saving entities in luci/gae?

I was not able to detect this is luci-go source code (maybe I missed it), and still waiting for the documentation.

danjacques commented 8 years ago

Not @riannucci , but the way to validate data is to implement the PropertyLoadSaver interfaces and validate on Load/Save. The implementation can be simple, falling through to GetPLS:

// Assert that we implement the interface.
var _ datastore.PropertyMap = (*MyType)(nil)

func (t *MyType) Load(pm datastore.PropertyMap) error {
  if err := datastore.GetPLS(t).Load(pm); err != nil {
    return err
  }  
  return t.validate()
}

func (t *MyType) Save(withMeta bool) (datastore.PropertyMap, error) {
  if err := t.validate() err != nil {
    return nil, err
  }
  return datastore.GetPLS(t).Save(withMeta)
}
jimbojetlag commented 8 years ago

Thanks @danjacques! A couple of questions:

  1. Let's say we need a validator for checking the min and max of a string. How can a PropertyLoadSaver take a parameter this way? I know the min and max can be stored in MyType struct, but how can they be set per entity kind?
  2. Are there some common reusable validators available in luci/gae, or luci-go?
danjacques commented 8 years ago

Let's say we need a validator for checking the min and max of a string. How can a PropertyLoadSaver take a parameter this way? I know the min and max can be stored in MyType struct, but how can they be set per entity kind?

Maybe we're thinking of different types of validation here. As with other datastore implementations, you will want to validate your data as much as possible elsewhere before you tell datastore to store it via Put. Validating on Load/Save is more computationally intensive and obscure than validating outside of datastore-related operations. The model of "good data in, good data out" fits most of the time.

The sort of validation that I was referencing in my previous comment is more appropriately employed as protection against schema changes than as a primary means of validating input data.

Are there some common reusable validators available in luci/gae, or luci-go?

Not really. Validation seemed custom enough that we left it to the implementer.

jimbojetlag commented 8 years ago

Maybe we're thinking of different types of validation here.

Sorry to be unclear, I was referring to user input validation. Some data access layers have this integrated, I was wondering if this is the case in lucy/gae.

danjacques commented 8 years ago

Ah I see - so confirming: there is no specific user input validation provision built into luci/gae.

The example that I gave fits more into the layer of database schema constraint enforcement. It's there, and it works, but it's not as efficient or effective as explicitly validating the data before engaging the data layer.