facebookincubator / Glean

System for collecting, deriving and working with facts about source code.
https://glean.software/

Question about batched writes #456

Closed Boarders closed 3 months ago

Boarders commented 4 months ago

In the docs here you explain how to do batched writes: https://glean.software/docs/write/#creating-a-database-using-the-command-line

but earlier in the document it says: "It is only possible to refer to ids from facts in the same file, if you are writing multiple files using glean write or via the sendJsonBatch API."

I am wondering, then, how one refers to earlier facts in a later file, or whether it is no longer possible to refer to such facts (perhaps it is possible by something other than the id). What is the currently recommended way to handle a case like this?

simonmar commented 3 months ago

Hi, sorry for the delayed response. Glean does allow facts in a JSON file to refer to existing facts in the DB. However, this is not as easy to use as you may think: within a single JSON file you can refer to facts defined earlier in the file, but if the fact is defined in another file then you can only refer to it by the fact ID assigned to it by Glean, which is not the same as the fact ID used in the JSON file. The only way to find out the real fact ID of a fact is to perform a query - this is entirely possible and fully supported, and some indexers do exactly this.

Hope this helps.
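For readers following along, the ID behaviour described in the reply above can be sketched with a toy model in Python. This is not Glean's API or its JSON schema: `ToyFactStore`, `write_batch`, `query`, the `toy.*` predicate names, and the starting ID of 1000 are all hypothetical and exist only for illustration. The sketch assumes that an `{"id": n}` reference is first looked up among the IDs declared earlier in the same batch and otherwise must be an ID the store itself assigned.

```python
import json


class ToyFactStore:
    """Toy stand-in for a fact database. NOT Glean's API; illustration only."""

    def __init__(self, first_id=1000):
        self._next_id = first_id
        self._ids = {}       # (predicate, canonical key) -> store-assigned ID
        self._known = set()  # every ID the store has handed out

    def _resolve(self, value, local):
        # Rewrite {"id": n} references to store-assigned IDs.
        if isinstance(value, dict):
            if set(value) == {"id"}:
                ref = value["id"]
                if ref in local:        # declared earlier in this batch
                    return {"id": local[ref]}
                if ref in self._known:  # already a store-assigned ID
                    return {"id": ref}
                raise KeyError(f"unknown fact id {ref}")
            return {k: self._resolve(v, local) for k, v in value.items()}
        return value

    def write_batch(self, facts):
        local = {}  # batch-local ID -> store-assigned ID; discarded afterwards
        for fact in facts:
            key = self._resolve(fact["key"], local)
            slot = (fact["predicate"], json.dumps(key, sort_keys=True))
            if slot not in self._ids:
                self._ids[slot] = self._next_id
                self._known.add(self._next_id)
                self._next_id += 1
            if "id" in fact:
                local[fact["id"]] = self._ids[slot]

    def query(self, predicate, key):
        # Look up the store-assigned ID of an existing fact by its key.
        return self._ids[(predicate, json.dumps(key, sort_keys=True))]


store = ToyFactStore()

# First batch: the second fact refers to the first by its batch-local ID 1.
store.write_batch([
    {"id": 1, "predicate": "toy.Name", "key": "foo"},
    {"id": 2, "predicate": "toy.Decl", "key": {"name": {"id": 1}}},
])

# Second batch: local ID 1 no longer means anything, so query for the
# store-assigned ID first and use that in the reference.
name_id = store.query("toy.Name", "foo")
store.write_batch([
    {"predicate": "toy.Use", "key": {"target": {"id": name_id}}},
])
```

In this toy model, a second batch that reused `{"id": 1}` without declaring it would fail with a `KeyError`, which mirrors the point that batch-local IDs do not carry over between files and the real ID has to come from a query.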

Boarders commented 3 months ago

No problem at all, that makes sense - thanks for the reply, it is helpful