pablopareja closed this issue 10 years ago
Sounds good because it allows us to work on a higher abstraction level.
I would appreciate a list of pre-defined steps for me to follow and implement.
OK, I will put that list together for you. In the meantime, please have a look and start running some tests with Gremlin and all these questions.
Hi André, did you start making the first tests with Gremlin?
Hi, I'm sorry but I've been busy with school work. I have my degree's final project to deliver on the 14th so I'll be a bit busy until then.
I've installed Gremlin and tried some examples. Is there anything specific I should test?
Thanks
I guess you already had a look at it but just in case you didn't here is the domain model for Bio4j:
Cardinality of relationships is included
A preliminary list of user defined steps for proteins could be:
Could you give me a practical example?
Thanks for your patience.
Please note that I have little biology background and I don't know how most of Bio4j works. Documentation is very scarce and most links in the examples section are broken, so I don't know how to start tackling this project.
All links broken: https://github.com/bio4j/bio4j/blob/master/docs/auxiliary-relationships.md
This is helpful: https://github.com/bio4j/bio4j/blob/master/docs/node-indexing.md
Let's take Protein Organism, for instance: we can query for a specific scientific name index or NCBI taxonomy id index.
How should the output look? Do we print all the information about each protein that fits the query, in the requested format (GEXF/GraphML/GraphSON)?
OK so let me try to answer all your questions:
Keep in mind that, in principle, you won't be interacting with the database through the Java API but rather with Gremlin queries.
- Regarding the user-defined steps for proteins that I mentioned in the previous message, they would all have a protein accession (or a list of protein accessions) as their starting point.
- When you ask how the output should look, what exactly do you mean? If you are referring to the user-defined steps, I think you may not have understood what they're about. There's no XML output file to be exported at this point, since steps are rather a way of encapsulating commonly used Gremlin sub-queries into just one keyword.
OK thanks. Could you then give me an example implementation of a user-defined step from those you've listed? I can pick it up from there and implement the remaining ones.
@andre-nunes what's up here?
I don't know how to define Gremlin steps in Java, care to demonstrate?
I've skimmed through this thread and I see that @pablopareja has already answered some of your questions. Don't hesitate to ask more if you don't understand the answers or anything else.
I don't think I can explain how to create a Gremlin step better than the guide in the Gremlin docs, but I'll be glad to answer your questions if you don't understand something there.
Well the whole guide is geared towards Groovy so I don't know how to do it in Java.
Here's a user-defined step:
Gremlin.defineStep('co',[Vertex,Pipe],
{String label -> _().as('x').out(label).in(label).except('x')})
In Java all I could find is this method:
`Gremlin.defineStep(String arg0, List<Class> arg1, Closure arg2)`
How do I represent `[Vertex, Pipe]` as a `List<Class>`?
Look, there is something relevant in the Gremlin docs: Using Gremlin through Java (+ this). There may be more, so I recommend searching the Gremlin docs carefully (and just googling particular Gremlin/Java things).
In general, Groovy mostly just adds syntactic sugar on top of Java, so google it and try to translate Groovy constructs into Java (and also try using Groovy in the Gremlin REPL, why not? :wink:)
Regarding your last question: as the type says, it's just a list of classes, so I guess it will be something like `List(Vertex.class, Pipe.class)` in Java
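As written, `List(Vertex.class, Pipe.class)` is not a valid Java expression. A minimal sketch of one way to build that argument is `Arrays.asList` with an explicit element type. The nested `Vertex` and `Pipe` interfaces below are placeholders (an assumption made so the snippet compiles without TinkerPop on the classpath); in the real project they would be the Blueprints and Pipes types, and the method name `stepSignature` is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class StepSignatureSketch {
    // Placeholders only: stand-ins for the TinkerPop Vertex and Pipe types.
    interface Vertex {}
    interface Pipe {}

    // Builds the Java counterpart of the Groovy list literal [Vertex, Pipe].
    // The raw Class type mirrors the List<Class> parameter quoted above.
    @SuppressWarnings("rawtypes")
    static List<Class> stepSignature() {
        return Arrays.<Class>asList(Vertex.class, Pipe.class);
    }

    public static void main(String[] args) {
        System.out.println(stepSignature());
    }
}
```

This only covers the second argument of `defineStep`; the third one is still a Groovy `Closure`, which is a separate problem when working from plain Java.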
My first try at this can be found here: https://github.com/bio4j/exporter/commit/919c9de070dd4bcb9b2d6b4c698d832c1a1656ca
It's still a work in progress and it's not tested. Could I get a more detailed explanation of what each user-defined step should do? At the moment they just iterate over the vertices with a given relationship.
More info on the daily issue: https://github.com/bio4j/exporter/issues/18
Thanks
Hi André,
As stated in the project proposal and the rest of the documentation, the language to be used is Java (not Groovy or anything similar).
Anyway, I don't see why so much time is needed to define a Gremlin step when the documentation explains pretty clearly how it should be done:
https://github.com/tinkerpop/gremlin/wiki/User-Defined-Steps
In fact, this link is the second result when googling "defining gremlin user steps in java".
That's all in Groovy, no mention of Java, that's my problem.
I asked on the Gremlin-users mailing list, and it's impossible to have named user-defined steps in Java: https://groups.google.com/forum/#!topic/gremlin-users/MaDb0Y1RqZo
I'll have to look into another approach, possibly using DSLs.
Is using TinkerPop3 out of the question? It looks easier to use in Java.
It's a bit strange. They were going to release it in August. Have they done it already??
Seems like it. In the meantime I've finally figured out how to use `GremlinPipeline` in Java, so I've made some progress.
My question now is: what will be the purpose of these steps in the context of the exporter? Will these be the queries that users can call through the CLI? If so, I guess they don't care how this works in the background. At the moment these "steps" return a list with the iterated vertices, so let me know explicitly what should be happening in the background.
I guess the user-defined steps should return a `GremlinPipeline` instance so that we can continue appending pipes... :smile:
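To make the idea concrete, here is a rough plain-Java analogy of a step that returns something chainable instead of a finished list. It does not use the `GremlinPipeline` API: `Node`, `co`, `demo` and the `protein_organism` label are all hypothetical names, and `java.util.stream.Stream` stands in for the pipeline so the sketch runs without TinkerPop. The `co` method mirrors the Groovy step quoted earlier in the thread (out over a label, back in over the same label, excluding the start).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CoStepSketch {
    static final String LABEL = "protein_organism"; // made-up label for illustration

    /** Hypothetical in-memory vertex; not the Blueprints Vertex. */
    static final class Node {
        final String id;
        final Map<String, List<Node>> out = new HashMap<>();
        final Map<String, List<Node>> in = new HashMap<>();
        Node(String id) { this.id = id; }

        // Adds a labelled edge this -> target and records the reverse direction.
        void link(String label, Node target) {
            out.computeIfAbsent(label, k -> new ArrayList<>()).add(target);
            target.in.computeIfAbsent(label, k -> new ArrayList<>()).add(this);
        }
    }

    /** "co" step: follow label out, then back in, and drop the start node.
     *  Returns a Stream (not a List) so callers can keep appending operations. */
    static Stream<Node> co(Node start, String label) {
        return start.out.getOrDefault(label, Collections.emptyList()).stream()
                .flatMap(n -> n.in.getOrDefault(label, Collections.emptyList()).stream())
                .filter(n -> n != start);
    }

    /** Builds a tiny graph and chains further operations after the step. */
    static List<String> demo() {
        Node p1 = new Node("P1"), p2 = new Node("P2"), p3 = new Node("P3"), p4 = new Node("P4");
        Node orgA = new Node("orgA"), orgB = new Node("orgB");
        p1.link(LABEL, orgA);
        p2.link(LABEL, orgA);
        p3.link(LABEL, orgA);
        p4.link(LABEL, orgB);
        return co(p1, LABEL).map(n -> n.id).sorted().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [P2, P3]
    }
}
```

The design point is the return type: because `co` hands back a lazy, chainable value, the caller decides whether to map, filter further, or collect, which is the same reason for returning a pipeline rather than an already-iterated list of vertices.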
The thing is that they won't be callable from the Gremlin REPL unless we use Groovy, which seems to be the point, or am I wrong?
The use of Gremlin user-defined steps would be strongly advisable in order to encapsulate sub-queries that could be reused in different queries. Beyond that, we should put together a list of such pre-defined steps, so that we end up with a sort of preliminary bio4j-gremlin-method-library.