apache / incubator-graphar

An open source, standard data file format for graph data storage and retrieval.
https://graphar.apache.org/
Apache License 2.0
217 stars 46 forks source link

feat(java-info): Refactor the java-info module base on the format protocol files #595

Closed Thespica closed 1 week ago

Thespica commented 1 month ago

Reason for this PR

close #539

What changes are included in this PR?

  1. add dependency of proto module
  2. remove enum class, instead to proto's enum
  3. replacy fields of info class by proto objects

Are these changes tested?

No

Are there any user-facing changes?

no

SemyonSinchenko commented 1 month ago

Do I understand it right, that it is only part of Java refactoring?

Thespica commented 1 month ago

Do I understand it right, that it is only part of Java refactoring?

It's basic refactoring for info module. And later on, I will add some unit tests and doc.

acezen commented 1 month ago

Do I understand it right, that it is only part of Java refactoring?

It's basic refactoring for info module. And later on, I will add some unit tests and doc.

Seems good, maybe you can add a basic test that construct GraphInfo from current testing dir's graph yaml in this PR. You need to consider how to convert yaml to json and using JsonFormat to parse to protobuf, as we discussed in https://lists.apache.org/thread/o5bqbhxvcbm6xqj1j1m2h7bhdnsvgsoy

SemyonSinchenko commented 1 month ago

Do I understand it right, that it is only part of Java refactoring?

It's basic refactoring for info module. And later on, I will add some unit tests and doc.

My question was because I have concerns about the current implementation.

  1. Hadoop dependency. If we want to have a "kernel-like" approach to the Java info classes, do we really need Hadoop as a dependency?
  2. load and save methods: load is static and cannot be re-defined, save is Hadoop based;
  3. Do we really need a wrapper for each of proto object? I was thinking maybe we can work directly with proto-classes and provide only an additional functionality (like save, load, etc.)? For example all these methods that just do string concatenations: if anyone wants to implement own GAR-connector based on our classes, it will be hard to re-define all this methods, etc.

If this PR is a part of the refactoring let's merge it. But if it is a final state of Java GAR I would like to discuss it little more!

Nevertheless it is a very cool work!

Thespica commented 1 month ago

Hello @SemyonSinchenko

From our biweekly meeting note, this PR will work on:

SemyonSinchenko commented 1 month ago

Hello @SemyonSinchenko

From our biweekly meeting note, this PR will work on:

  • improve extendablility of load and save by loader/saver;
  • and support for json/binary format(extend test case).

@Thespica I did not get it clear, may you please explain a bit? Did you mean, that you would continue in this PR and implement what did you mention? Or did you mean, that we should merge this PR as is and you will continue what you mention in the next one?

acezen commented 1 month ago

Hello @SemyonSinchenko From our biweekly meeting note, this PR will work on:

  • improve extendablility of load and save by loader/saver;
  • and support for json/binary format(extend test case).

@Thespica I did not get it clear, may you please explain a bit? Did you mean, that you would continue in this PR and implement what did you mention? Or did you mean, that we should merge this PR as is and you will continue what you mention in the next one?

I think we can add a simple test case for this PR and merged, then continue which @Thespica mention in the next one.

Thespica commented 2 weeks ago

Hi @acezen @SemyonSinchenko please take a look.

I have test load and save locally, I well add it to CI and do some renaming later(not included in this PR).