cs3org / reva

WebDAV/gRPC/HTTP high performance server to link high level clients to storage backends
https://reva.link
Apache License 2.0

Questions about REVA #199

Open butonic opened 5 years ago

butonic commented 5 years ago

The goal is to scale development

What is reva?

How to do that?

Further benefits

Drawbacks

Alternatives

labkode commented 5 years ago

The goal is to scale development

  • onboard new developers (golang is young, few good people available)
  • make learning reva easier

One of the most effective approaches is to write documentation. The current documentation has deprecated sections, and writing good documentation is the easiest way to onboard people. The technologies behind REVA are understood by the golang community; only the connection between REVA services needs to be explained clearly, that is, how the components interact with each other. A white paper with an in-depth analysis will be published soon.

What is reva?

  • Reference implementation for CS3 API, or
  • A framework to implement CS3 services?

REVA is an open and interoperable platform to connect Cloud Storages and Application Providers. It is the first platform that implements the CS3 APIs, and therefore will be the reference implementation for those. REVA will empower running different CS3 services as a standalone platform or by integrating it into commercial providers ecosystems (Nexus, in case of ownCloud).

How to do that?

  • use well known frameworks and best practices, e.g.

    • micro or kit (come with metrics and tracing)

REVA relies heavily on the standard framework, a.k.a. the Go standard library. Another framework that REVA depends on heavily is gRPC. gRPC is the core protocol of REVA and a building block for the framework-over-frameworks you mentioned, so it is expected that more people know gRPC than any of those frameworks.

  • make reva only one thing: the framework used to implement CS3 services

See my definition of what REVA is in previous section.

  • create new repositories for different services (move away from monorepo)

Core services will live in REVA as part of the core platform. It was agreed in previous face-to-face meetings that ownCloud-specific services will live in the owncloud organization. They are currently in the cs3org/reva repository for convenience, until the platform reaches a stable status and ownCloud has set up the necessary internal processes to build Nexus on top of REVA (adding the necessary services to the core services, for example ocssvc, ocdav, ...).

  • use existing service registries (compare the current toml based config to docker compose files, they are both for orchestration)

Despite the name we use in REVA for internal packages named registries, they are level 7 service discovery mechanisms with knowledge of application specific constraints. Another difference is that REVA components don't need any external components. Docker-compose or other orchestration engines are complementary to REVA, not a replacement.

This is already the case.

Further benefits

  • battle hardened frameworks that are used in production

I agree, that is why REVA uses the most robust framework in Go, the Go standard library, together with the gRPC framework (both used in production at scale).

  • avoid not invented here syndrome
  • stop wasting time writing existing functionality

I agree, that's why REVA relies heavily on existing libraries and frameworks for complex logic.

  • services can progress independently

That will be the case once non-core services live in different repositories (like ocdav, ocssvc and others that will follow, belonging to different organisations/companies).

  • keep dependencies minimal per services

REVA strives for minimal dependencies. Any new direct dependency is evaluated before being committed to the repo, including license checks, to ensure that free adoption of REVA across the community is not blocked by a third-party dependency whose license is incompatible with REVA's list of allowed licenses.

  • reva as the core framework can keep dependencies minimal

I can't agree more.

Drawbacks

  • CS3 api changes force an update of all services

    • protobuf is versioned
    • when reaching v1 we need to move away from monorepo anyway

This applies not only to the CS3 APIs or gRPC; it is a general problem of protocols. That is why the CS3 APIs are versioned following Google and Uber best practices for gRPC service definitions, to ensure that only major changes in the API (v1 to v2) require all services to be updated to consume the new API version.

  • quality of dependencies might be sub par

    • umm there are no tests in reva, yet

As REVA will probably reach a stable phase before the end of this year, and no major changes are expected either in the definition of the API or in the internal working structure of the components, a task has been created to start retrofitting tests, starting with internal packages: https://github.com/cs3org/reva/issues/18

Alternatives

  • reuse reva services in separate repos

    • http services are instantiated with New; the http.Handler interface can be used with any mux
    • grpc services are registered with a grpc.Server in New() as well

That will be the case when ownCloud components, for example, live in the owncloud GitHub organization and rely on REVA for instantiating them, as explained in the internal document shared between ownCloud and CERN on how to achieve this.
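To make the HTTP half of this concrete, here is a minimal sketch of a service that is instantiated with New and mounted on a standard mux from outside its own package. It is not taken from the reva code base: helloService, its greeting field and the fetch helper are hypothetical names, and the gRPC case (registering with a grpc.Server) is left out to keep the example stdlib-only.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// helloService is a hypothetical stand-in for an HTTP service living in its own repo.
type helloService struct {
	greeting string
}

// New returns the service as a plain http.Handler, so the caller decides which mux to use.
func New(greeting string) http.Handler {
	return &helloService{greeting: greeting}
}

// ServeHTTP writes the greeting followed by the requested path.
func (s *helloService) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "%s from %s", s.greeting, r.URL.Path)
}

// fetch mounts the handler on a standard ServeMux and runs one in-memory request against it.
func fetch(path string) string {
	mux := http.NewServeMux()
	mux.Handle("/hello/", New("hi"))

	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	fmt.Println(fetch("/hello/world")) // prints "hi from /hello/world"
}
```

Because New only promises an http.Handler, the same value could be handed to any third-party router that accepts that interface.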

tboerger commented 5 years ago

One of the most effective approaches is to write documentation. The current documentation has deprecated sections, and writing good documentation is the easiest way to onboard people. The technologies behind REVA are understood by the golang community; only the connection between REVA services needs to be explained clearly, that is, how the components interact with each other. A white paper with an in-depth analysis will be published soon.

But if we use well-known libraries like go-kit, cobra and viper, it will lower the barrier even further, because these tools are so widely used that most Go developers already know enough about them, without needing to understand all the boilerplate code that gets reinvented within Reva.

REVA is an open and interoperable platform to connect Cloud Storages and Application Providers. It is the first platform that implements the CS3 APIs, and therefore will be the reference implementation for those. REVA will empower running different CS3 services as a standalone platform or by integrating it into commercial providers ecosystems (Nexus, in case of ownCloud).

That sounds totally fine.

REVA relies heavily on the standard framework, a.k.a. the Go standard library. Another framework that REVA depends on heavily is gRPC. gRPC is the core protocol of REVA and a building block for the framework-over-frameworks you mentioned, so it is expected that more people know gRPC than any of those frameworks.

Sure, it's using the standard library and gRPC, but there are various parts where the wheel gets reinvented as all this functionality is already provided in a battle-tested way by kit or micro. This would reduce the hand-written boilerplate code.

Core services will live in REVA as part of the core platform. It was agreed in previous face-to-face meetings that ownCloud-specific services will live in the owncloud organization. They are currently in the cs3org/reva repository for convenience, until the platform reaches a stable status and ownCloud has set up the necessary internal processes to build Nexus on top of REVA (adding the necessary services to the core services, for example ocssvc, ocdav, ...).

Even if the core services are kept within this repo it's still a monorepo. At least with ownCloud we have seen that it's not always good to keep all in a single repository, we have split more and more apps out of the ownCloud repository into dedicated repositories.

Despite the name we use in REVA for internal packages named registries, they are level 7 service discovery mechanisms with knowledge of application specific constraints. Another difference is that REVA components don't need any external components. Docker-compose or other orchestration engines are complementary to REVA, not a replacement.

But so far the revad.toml feels more like a service discovery than a plain configuration.

I agree, that's why REVA relies heavily on existing libraries and frameworks for complex logic.

But we are still wasting time with writing opencensus integrations and other things which had already been done by frameworks like kit or micro.

REVA strives for minimal dependencies. Any new direct dependency is evaluated before being committed to the repo, including license checks, to ensure that free adoption of REVA across the community is not blocked by a third-party dependency whose license is incompatible with REVA's list of allowed licenses.

Reva got a minimal set of dependencies, but with the monorepo approach there is no way to integrate more dependencies for other services as it always affects the whole repository.

As REVA will probably reach a stable phase before the end of this year, and no major changes are expected either in the definition of the API or in the internal working structure of the components, a task has been created to start retrofitting tests, starting with internal packages: https://github.com/cs3org/reva/issues/18

Tests should not be introduced only when the project is reaching a stable release. Such an important piece of software should be covered by unit tests from the beginning; otherwise it can lead to architectures that are really hard to test, because nobody ever thought about the abstractions needed to mock parts of the application and get proper unit tests running. I'm not talking about acceptance tests, that's a totally different story, but it should at least include some unit tests. Otherwise it's always hard to do proper refactorings, as you have no indicator besides heavy manual testing of whether a refactoring succeeded or failed. So on this part I can't agree at all.
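As an illustration of the kind of abstraction meant here, the following is a minimal sketch in which the logic depends on a small interface, so a test can substitute an in-memory fake for the real backend. All names (Storage, fakeStorage, TotalSize) are hypothetical and are not taken from the reva code base.

```go
package main

import (
	"errors"
	"fmt"
)

// Storage is a hypothetical abstraction over a storage backend.
type Storage interface {
	Stat(path string) (size int64, err error)
}

// fakeStorage is an in-memory test double that maps paths to sizes.
type fakeStorage map[string]int64

// Stat returns the recorded size, or an error for unknown paths.
func (f fakeStorage) Stat(path string) (int64, error) {
	size, ok := f[path]
	if !ok {
		return 0, errors.New("not found: " + path)
	}
	return size, nil
}

// TotalSize depends only on the interface, so it can be unit tested without a real backend.
func TotalSize(s Storage, paths []string) (int64, error) {
	var total int64
	for _, p := range paths {
		size, err := s.Stat(p)
		if err != nil {
			return 0, err
		}
		total += size
	}
	return total, nil
}

func main() {
	fake := fakeStorage{"/a": 10, "/b": 32}
	total, err := TotalSize(fake, []string{"/a", "/b"})
	fmt.Println(total, err) // prints "42 <nil>"
}
```

If such seams exist from the start, a refactoring of TotalSize can be checked in milliseconds instead of through manual testing against a live system.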

butonic commented 5 years ago

We are currently planning this in https://github.com/owncloud/ocis

labkode commented 5 years ago

One of the most effective approaches is to write documentation. The current documentation has deprecated sections, and writing good documentation is the easiest way to onboard people. The technologies behind REVA are understood by the golang community; only the connection between REVA services needs to be explained clearly, that is, how the components interact with each other. A white paper with an in-depth analysis will be published soon.

But if we use well-known libraries like go-kit, cobra and viper, it will lower the barrier even further, because these tools are so widely used that most Go developers already know enough about them, without needing to understand all the boilerplate code that gets reinvented within Reva.

If you think that using cobra or viper can help onboarding new people, please feel free to submit a PR for the cmd/reva cli, we'd be happy to review it.

REVA is an open and interoperable platform to connect Cloud Storages and Application Providers. It is the first platform that implements the CS3 APIs, and therefore will be the reference implementation for those. REVA will empower running different CS3 services as a standalone platform or by integrating it into commercial providers ecosystems (Nexus, in case of ownCloud).

That sounds totally fine.

REVA relies heavily on the standard framework, a.k.a. the Go standard library. Another framework that REVA depends on heavily is gRPC. gRPC is the core protocol of REVA and a building block for the framework-over-frameworks you mentioned, so it is expected that more people know gRPC than any of those frameworks.

Sure, it's using the standard library and gRPC, but there are various parts where the wheel gets reinvented as all this functionality is already provided in a battle-tested way by kit or micro. This would reduce the hand-written boilerplate code.

Can I kindly ask you to be more specific about your statements about what gets reinvented? With general statements I feel the feature creep syndrome here.

Core services will live in REVA as part of the core platform. It was agreed in previous face-to-face meetings that ownCloud-specific services will live in the owncloud organization. They are currently in the cs3org/reva repository for convenience, until the platform reaches a stable status and ownCloud has set up the necessary internal processes to build Nexus on top of REVA (adding the necessary services to the core services, for example ocssvc, ocdav, ...).

Even if the core services are kept within this repo it's still a monorepo. At least with ownCloud we have seen that it's not always good to keep all in a single repository, we have split more and more apps out of the ownCloud repository into dedicated repositories.

REVA will implement the core services, which you can consult here: https://cs3org.github.io/cs3apis/. The rest will go to your organization (ownCloud-dependent services) and to us (CERN-specific services). REVA will implement only the core services.

Despite the name we use in REVA for internal packages named registries, they are level 7 service discovery mechanisms with knowledge of application specific constraints. Another difference is that REVA components don't need any external components. Docker-compose or other orchestration engines are complementary to REVA, not a replacement.

But so far the revad.toml feels more like a service discovery than a plain configuration.

REVA follows the same configuration management philosophy as NGINX. A feeling is not reason enough to depend on third-party orchestration engines for the core project.

I agree, that's why REVA relies heavily on existing libraries and frameworks for complex logic.

But we are still wasting time with writing opencensus integrations and other things which had already been done by frameworks like kit or micro.

The time spent on the opencensus integration is close to nothing. The code that mentions opencensus wires trace IDs into the logging system, to help the support team debug errors, and configures the backend (both of which also need to be done in framework XYZ). So please be more specific and point to aspects of the code that could be off-loaded to third-party libraries where the cost of the change is worth it.

REVA strives for minimal dependencies. Any new direct dependency is evaluated before being committed to the repo, including license checks, to ensure that free adoption of REVA across the community is not blocked by a third-party dependency whose license is incompatible with REVA's list of allowed licenses.

Reva got a minimal set of dependencies, but with the monorepo approach there is no way to integrate more dependencies for other services as it always affects the whole repository.

As REVA will probably reach a stable phase before the end of this year, and no major changes are expected either in the definition of the API or in the internal working structure of the components, a task has been created to start retrofitting tests, starting with internal packages: https://github.com/cs3org/reva/issues/18

Tests should not be introduced only when the project is reaching a stable release. Such an important piece of software should be covered by unit tests from the beginning; otherwise it can lead to architectures that are really hard to test, because nobody ever thought about the abstractions needed to mock parts of the application and get proper unit tests running. I'm not talking about acceptance tests, that's a totally different story, but it should at least include some unit tests. Otherwise it's always hard to do proper refactorings, as you have no indicator besides heavy manual testing of whether a refactoring succeeded or failed. So on this part I can't agree at all.

I understand your concerns, Thomas. However, the main objective for REVA/Nexus is to provide a minimum viable product on a very tight deadline for your customer. Jorn (ownCloud) and I (CERN) have been working together to bring the MVP to life as fast as we can, and part of this exercise involves taking the risk of not having a well-tested system. If we had focused on having a 100% tested system, REVA would not be able to offer a viable product as of today. Any help with fixing the existing issues or helping Jorn and me will be very much appreciated; by complaining about already acknowledged problems we don't gain anything.