typedb / typedb

TypeDB: the power of programming, in your database
https://typedb.com
Mozilla Public License 2.0
3.87k stars 340 forks source link

Macro for @graknlabs_grakn_core dependencies, to be loaded by other workspaces #4867

Open lolski opened 5 years ago

haikalpribadi commented 5 years ago

This should be the content of the macro: https://github.com/graknlabs/client-python/blob/master/WORKSPACE#L46

lolski commented 5 years ago

@haikalpribadi @vmax

The original idea was to create a macro containing grakn-core's dependency in a .bzl file:

def graknlabs_grakn_core_dependencies():
    ....
    load("@graknlabs_grakn//dependencies/maven:dependencies.bzl", maven_dependencies_for_build = "maven_dependencies")
    maven_dependencies_for_build()
    load("@graknlabs_grakn//dependencies/compilers:dependencies.bzl", "antlr_dependencies")
    antlr_dependencies()
    ...

which can be loaded from other repositories such as docs or client-python like this:

# client-python's WORKSPACE

workspace(name = "graknlabs_client_python")

git_repository(
    name = "graknlabs_grakn_core",
    remote = "https://github.com/graknlabs/grakn",
    commit = "6f3436f6cd73004f3faa1668a3721aa25deda2f0"
)

load("@graknlabs_grakn_core//:dependencies.bzl", "graknlabs_grakn_core_dependencies")
graknlabs_grakn_core_dependencies()

This is apparently not possible because load() can't be used inside a macro (https://github.com/bazelbuild/bazel/issues/1550). Skimming through the issue shows that many people are asking for it. However the Bazel team is reluctant to implement it as it brings in complexity.

haikalpribadi commented 5 years ago

I think we need to take inspiration from bazel-deps, @lolski. We can see in //dependencies/maven/dependencies.bzl that it generates the list of dependencies through one final function:

def maven_dependencies(callback = jar_artifact_callback):
    for hash in list_dependencies():
        callback(hash)

It then gets called in the WORKSPACE file by:

load("//dependencies/maven:dependencies.bzl", "maven_dependencies")
maven_dependencies()

So let's look at how maven_dependencies are able to produce the list of dependencies through one function, and copy that.

@vmax do you know how it works by looking at the code?

haikalpribadi commented 5 years ago

I feel like this will be quite a bit of work. Do we need it for 1.5 release, @lolski @vmax ? Otherwise, we can work on it after we release 1.5.

lolski commented 5 years ago

@haikalpribadi

Me and @vmax had a brief discussion and think we can still improve readability by quite a lot simply by improving the structure of the WORKSPACE file. The structure should be standardised where you have a "section" for each direct dependencies.

For example, for client-java which directly depends on Maven dependencies, Grakn Core and Graql, the WORKSPACE file would be structured like this:

workspace(name = "graknlabs_client_java")

######################################################
# maven dependencies
######################################################
load("//dependencies/maven:dependencies.bzl", "maven_dependencies")
maven_dependencies()

######################################################
# grakn-core and their transitive deps
######################################################
# grakn-core
load("//dependencies/graknlabs:dependencies.bzl", "graknlabs_grakn_core")
graknlabs_grakn_core()

# grakn-core's transitive dependencies
load("//dependencies/graknlabs:dependencies.bzl", "graknlabs_build_tools")
graknlabs_build_tools()
load("@graknlabs_build_tools//grpc:dependencies.bzl", "grpc_dependencies")
grpc_dependencies()
...

######################################################
# graql and their transitive deps
######################################################
load("@graknlabs_grakn_core//dependencies/graknlabs:dependencies.bzl", "graknlabs_graql")
# graql
graknlabs_graql()

# graql's transitive dependencies
load("@graknlabs_graql//dependencies/compilers:dependencies.bzl", "antlr_dependencies")
antlr_dependencies()
...
haikalpribadi commented 5 years ago

Okay. How far is it from "ideal" right now? But most importantly: do we even still need or want to create a macro for dependencies of a given repo to be easily loadable in another repo? @vmax @lolski

lolski commented 5 years ago

@haikalpribadi There is a technical limitation where a Bazel macro can't call another macro. Without that ability, there's no way the macro solution can be implemented. Right @vmax ?

As of now, each Bazel project have a slightly different structure. It's making it hard for the team to understand and being able to modify it with confidence.

haikalpribadi commented 5 years ago

Okay then I'd say lets put this on the back burner for now, @lolski @vmax. We can tidy up the organisation of the dependency declaration in the WORKSPACE file as we go, but the full solution would be when we can create a macro for every repo's set of dependencies. I'm sure there must be a way, but it's not urgent for now.

lolski commented 5 years ago

Sounds good to me

haikalpribadi commented 5 years ago

I'm not sure if we have ever covered this, but it seems like @stackb_rules_proto are doing exactly what we're trying to achieve in this issue?

Let's say we way to use java_grpc_compile, then they allow us to load the dependencies of their workspace easily by doing:

load("@stackb_rules_proto//java:deps.bzl", "java_grpc_compile")
java_grpc_compile()

And when you look into @stackb_rules_proto//java:deps.bzl, you can see that java_grpc_compile() is just a macro that loads external workspaces. Ultimately, they all get loaded from @stackb_rules_proto//:deps.bzl.

Is there anything we can learn from their technique, @vmax @lolski ?

haikalpribadi commented 5 years ago

I tried implementing this strategy for @graknlabs_protocol, and declared all the dependencies in //:deps.bzl.

load("@graknlabs_build_tools//grpc:dependencies.bzl", "grpc_dependencies")
load("@com_github_grpc_grpc//bazel:grpc_deps.bzl", "grpc_deps")
load("@stackb_rules_proto//java:deps.bzl", "java_grpc_compile")

def graknlabs_protocol_deps(**kwargs):
    grpc_dependencies(**kwargs)
    grpc_deps(**kwargs)
    java_grpc_compile(**kwargs)

When I tried loading it from @graknlabs_client_java, it complained about the fact that @stackb_rules_proto cannot be resolved, which is probably because the macros hasn't been called. Hhmmmm... 🤔

lolski commented 5 years ago

@haikalpribadi @vmax From investigating stackb/rules_proto I came up with a conclusion that their approach can't be used.

rules_proto's java_grpc_compile() is a macro which in the end will expand to a bunch of http_archives which are defined in the current workspace. This is a simple enough use case for a Bazel macro to fulfil:

def java_grpc_compile():
   http_archive(...)
   http_archive(...)

Our macro is more complicated since we want to load a macro which is loaded by another macro:

load("@graknlabs_build_tools", "graknlabs_build_tools")
graknlabs_build_tools()

# this load statement is only available after @graknlabs_build_tools is loaded
load("@graknlabs_build_tools//distribution:dependencies.bzl", "graknlabs_bazel_distribution")
graknlabs_bazel_distribution()

As stated in https://github.com/graknlabs/grakn/issues/4867#issuecomment-462734077, we can't make them a single macro like this since a Bazel macro can't contain load statement:

def graknlabs_build_tools_and_its_transitive_dependency():
  load("@graknlabs_build_tools", "graknlabs_build_tools")
  graknlabs_build_tools()

  load("@graknlabs_build_tools//distribution:dependencies.bzl", "graknlabs_bazel_distribution")
  graknlabs_bazel_distribution()

The solution would be to wait until Bazel has what's called the "recursive workspaces". There are demands for it and we can only wait and see:

haikalpribadi commented 5 years ago

One idea to try out in the future is to have the macro of a given repository to load the macro of its dependency first; i.e. a recursive macro declaration which we will resolve to the full list of transitive dependencies of workspaces. @lolski