Habush / atomspace-rpc

A gRPC server and client to execute pattern matching queries
GNU Affero General Public License v3.0

Could not find `protobufConfig.cmake` #4

Open · linas opened this issue 2 years ago

linas commented 2 years ago

I'm trying to build this package (so that I can build the latest annotation-scheme), and it's now failing with

CMake Error at src/CMakeLists.txt:10 (find_package):
  Could not find a package configuration file provided by "protobuf" with any
  of the following names:

    protobufConfig.cmake
    protobuf-config.cmake

... ???

linas commented 2 years ago

After stubbing out that code, I now get

CMake Error at src/CMakeLists.txt:14 (find_package):
  Could not find a package configuration file provided by "gRPC" with any of
  the following names:

    gRPCConfig.cmake
    grpc-config.cmake

I have both protobuf and grpc installed.

Habush commented 2 years ago

Can you try replacing https://github.com/Habush/atomspace-rpc/blob/731cceefc0ac2e9d3bc4a416afc63f73baa2d573/src/CMakeLists.txt#L10

with `find_package(Protobuf REQUIRED)`?
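
That is, roughly this change (I'm guessing that line 10 currently does a config-mode lookup, which needs a protobufConfig.cmake that only protobuf's own CMake-based install provides; module mode uses the FindProtobuf.cmake bundled with CMake):

    # before (config mode -- wants protobufConfig.cmake / protobuf-config.cmake):
    # find_package(protobuf CONFIG REQUIRED)

    # after (module mode -- uses CMake's bundled FindProtobuf.cmake):
    find_package(Protobuf REQUIRED)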

Habush commented 2 years ago

As for gRPC, please check this issue comment from the library maintainers. Basically, you'll have to use the following CMake options when building gRPC:

cmake -DgRPC_INSTALL=ON -DgRPC_BUILD_TESTS=OFF -DgRPC_PROTOBUF_PROVIDER=package -DgRPC_ZLIB_PROVIDER=package -DgRPC_CARES_PROVIDER=package -DgRPC_SSL_PROVIDER=package -DCMAKE_BUILD_TYPE=Release ..
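
Spelled out, the from-source build looks roughly like this (the clone step and the sudo make install at the end are assumptions about your setup):

    git clone --recurse-submodules https://github.com/grpc/grpc
    cd grpc && mkdir build && cd build
    # install targets on, tests off, and use the system-packaged
    # protobuf/zlib/c-ares/ssl rather than the vendored copies:
    cmake -DgRPC_INSTALL=ON -DgRPC_BUILD_TESTS=OFF \
          -DgRPC_PROTOBUF_PROVIDER=package -DgRPC_ZLIB_PROVIDER=package \
          -DgRPC_CARES_PROVIDER=package -DgRPC_SSL_PROVIDER=package \
          -DCMAKE_BUILD_TYPE=Release ..
    make -j$(nproc) && sudo make install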

I will send a PR for these two issues tomorrow morning.

linas commented 2 years ago

> find_package(Protobuf REQUIRED)

That fixed the protobuf issue! Thanks!

linas commented 2 years ago

For gRPC, I just used `sudo apt install libgrpc++-dev` -- I'm just using Ubuntu, trying not to build anything from source if at all possible.
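
(For anyone else taking the distro route, the full set of Ubuntu dev packages would be roughly this -- package names from memory:

    sudo apt install libgrpc++-dev libprotobuf-dev protobuf-compiler protobuf-compiler-grpc

protobuf-compiler-grpc is the one that ships the grpc_cpp_plugin that protoc needs to generate the C++ stubs.)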

linas commented 2 years ago

Well, this is weird. After your suggestion, protobuf was found:

-- Found Protobuf: /usr/lib/x86_64-linux-gnu/libprotobuf.so (found version "3.0.0") 
-- Using protobuf 

Then I ran cmake a second time, and it's giving me:

-- Found Protobuf: /usr/lib/x86_64-linux-gnu/libprotobuf.so (found version "3.0.0") 
-- Using protobuf 
CMake Error at /usr/share/cmake-3.10/Modules/FindProtobuf.cmake:229 (get_filename_component):
  get_filename_component unknown component NAME_WLE
Call Stack (most recent call first):
  /usr/share/cmake-3.10/Modules/FindProtobuf.cmake:293 (protobuf_generate)
  src/CMakeLists.txt:25 (protobuf_generate_cpp)

which it did not do the first time around. I'm investigating now.

linas commented 2 years ago

I'm stumped. Here's the meta-issue: I wanted to dump the contents of the atomspace for @amirouche as an example of an "s-expression database". Along the way, I tried updating annotation-scheme, and then ran into these protobuf/grpc issues.

The meta-question: why does annotation-scheme depend on these? My thoughts run like this: basic modular design means that the core component should not depend on network communication, and that networking should appear only in some other (optional) GitHub repo. The reason for modular design is to avoid fragility and buildability issues like this....

Re: fragility: I'm also thinking that making annotation-scheme depend on a custom version of fibers is asking for a maintainability and buildability nightmare: the code becomes fragile and easy to break. FWIW, from what I can tell, fibers is an abandonware project: it's got half a dozen unmerged pull requests for assorted fixes, and no one is closing or merging them. You might want to think about refactoring so that fibers is no longer a dependency. That would give you a more robust, more powerful, easier-to-maintain package. Dependencies on weird packages ... hurt.

linas commented 2 years ago

I hacked around the NAME_WLE issue by changing it to NAME_WE in FindProtobuf.cmake -- apparently, NAME_WLE is available only in the very newest CMake (which I do not have; I'm just using Ubuntu).
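
For reference: NAME_WLE ("name without last extension") appeared in CMake 3.14, while NAME_WE strips everything after the first dot and works in the 3.10 I have. For single-extension names like foo.proto the two agree, so the substitution is harmless here:

    # NAME_WLE needs CMake >= 3.14; NAME_WE works on 3.10.
    # For "foo.proto" both yield "foo":
    get_filename_component(_basename "foo.proto" NAME_WE)
    # They differ only on multi-dot names: for "foo.pb.cc",
    # NAME_WE gives "foo" but NAME_WLE would give "foo.pb".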

linas commented 2 years ago

Ugh. Now I'm getting

CMake Warning at /usr/share/cmake-3.10/Modules/FindgRPC.cmake:72 (find_package):
  By not providing "FindProtobufWithTargets.cmake" in CMAKE_MODULE_PATH this
  project has asked CMake to find a package configuration file provided by
  "ProtobufWithTargets", but CMake did not find one.

  Could not find a package configuration file provided by
  "ProtobufWithTargets" with any of the following names:

    ProtobufWithTargetsConfig.cmake
    protobufwithtargets-config.cmake

I get the feeling this code is ... not well-maintained.

Habush commented 2 years ago

I haven't checked this code in a while. I will update it tomorrow.

linas commented 2 years ago

FWIW, I hacked around this in annotation-scheme by simply not loading (opencog grpc), and so far, for whatever it is I'm doing, that's enough. (I'm not sure what I'm doing; I'm just running old scripts.)

Habush commented 2 years ago

Hi @linas, I have updated the CMake files and the README. Can you try again?

Habush commented 2 years ago

> The meta-question: why does annotation-scheme depend on these?

The annotation-scheme code needs to access a remote AtomSpace to execute pattern matching queries. I think you implemented a newer version of the remote pattern execution code (I can't find the repo) and spent some time carefully designing it. Maybe I will switch to using your version and retire this repo.

> Re: fragility: I'm also thinking that making annotation-scheme depend on a custom version of fibers is asking for a maintainability and buildability nightmare: the code becomes fragile and easy to break. FWIW, from what I can tell, fibers is an abandonware project

True. I am using fibers because I needed Go-like concurrency to pass messages between the code that runs pattern matching queries, the atomese->json parser, and the file writer that writes out the results. And I am depending on a specific version because the original fibers uses the epoll API for I/O, which is Linux-specific. I hacked it to depend on libevent instead, for cross-platform compatibility (and to make my life easier, as I work on macOS). @aconchillo was kind enough to go through my changes and add more of his own, and now it's working, but @wingo hasn't decided to merge it. So I'm stuck using this particular version. (The libevent version isn't actually necessary for Linux users.) From my experience, this is a general issue with the Guile community: there aren't a lot of libraries, and the ones that exist aren't updated frequently or even maintained.

> You might want to think about refactoring so that fibers is no longer a dependency. That would give you a more robust

I agree. And that will require a lot of changes. But right now, I am too busy with other projects to expend such an effort on replacing fibers. Hopefully, I will get to it some time soon :)

linas commented 2 years ago

> I think you implemented a newer version of the remote pattern execution code

Yes. The wiki page is here: https://wiki.opencog.org/w/StorageNode and the demos are here: https://github.com/opencog/atomspace/tree/master/examples/atomspace

The correct order to read the demos is:

* persist-store.scm, showing how to dump to a flat file
* persistance.scm, for Rocks or SQL or the CogServer
* persistance-query.scm, for running a remote query
* persist-multi.scm, for using multiple storage providers at once
* distributed.scm, for doing all of the above over the network

It's one common API that works for file, SQL, RocksDB and network access. See https://wiki.opencog.org/w/CogStorageNode for the network variant.

> (I can't find the repo)

To use the network node, two repos are needed: https://github.com/opencog/cogserver and https://github.com/opencog/atomspace-cog

FWIW, one could "easily" implement any other kind of networking using this same API. Although I don't see much point, since I think the CogStorageNode will be pretty danged fast. It's got very little overhead.

linas commented 2 years ago

BTW: distantly related, but perhaps of interest: I have this vague idea that it might be possible to write a shim that maps specific atoms to specific dataset queries, at run-time, on demand.

So, for example: currently, you use a bulk importer to import all kinds of different bio data sources into the atomspace. This means that you have to wait, maybe a long time, before all those datasets get loaded. I'm thinking that it might be possible to write a mapper that converts (Evaluation (Predicate "foo") (Protein "bar") ...) into a specific query against a specific datasource, but only when you ask for that particular atom (or set of atoms), i.e. when you query for it.

Right now, this is just a hunch - I think it should be possible, but hard to tell till one tries to code it up. It seems like it could be a good way of interfacing to multiple data sources at once, instead of having to import them. Even better if it can be made read/write. Assuming you can live with the latency / overhead.

This wouldn't be networked (unless the datasource is networked) -- rather, it's just a shim for converting Atomese into some other format, and back, on the fly, as needed.

linas commented 2 years ago

> I have updated the CMake files and the README. Can you try again?

Sigh. The instructions call for building grpc from source, and that seems like a step too far. I'm going to punt. Mostly, I just needed to export the old genome dataset so that @amirouche could look at it; I did that, and so I don't need the rest of the baggage.

Habush commented 2 years ago

You are referring to something like the Neo4j ETL Tool, but not limited to RDBMSes, right? I think this is a great idea. It would reduce the time spent writing code to convert some data format to s-exprs and load it into the Atomspace, provided the shim supports common data formats.

> Assuming you can live with the latency / overhead

Users who don't want to deal with the latency can wait till everything is loaded into the Atomspace and proceed from there. But there will be cases where the latency won't be much of a problem, so it is a win-win.

Habush commented 2 years ago

> Sigh. The instructions call for building grpc from source, and that seems like a step too far. I'm going to punt. Mostly, I just needed to export the old genome dataset so that @amirouche could look at it; I did that, and so I don't need the rest of the baggage.

Completely understood. I remember also being frustrated by protobuf/grpc CMake issues while writing this code. As for the JSON version of the bio dataset, I think @tanksha has already converted it to JSON and can share it.

tanksha commented 2 years ago

I used the annotation-scheme JSON parser to convert the bio-atomspace into JSON; it can be found here: https://mozi.ai/datasets/bioas-json/. Note that the format is designed to be compatible with the Cytoscape visualizer.

As for the gRPC dependency issue, I set (setenv "TEST_MODE" "TRUE") to run PM queries on the default local AtomSpace, and commented out the (opencog grpc) module.

linas commented 2 years ago

> ETL tool

Yes. Something like that.

> JSON

I made a strong argument about why JSON is exactly the wrong format. (I mean, it might be great for the bio data; it's just wrong in general, for other things.) The general discussion is about a database of s-expressions. There already exist several databases of JSON expressions, at least some with commercial companies behind them. I think that a database of s-expressions is still a good commercial startup idea. It's just that, well, the local crowd around these parts does not have much business sense. But if you think you know someone who could be a tech-startup CEO, or someone who can do marketing and sales, then building and selling an s-expression database -- I think it's a viable business.

gl-yziquel commented 10 months ago

I built grpc from source. Took ages. But it worked fine on this point.

(I'm hitting another issue: https://github.com/Habush/atomspace-rpc/issues/6 )