We want to add JSON serialization for the plan nodes. The plan node PR is still under review, but we can make progress on the serialization while that's happening. If you are able to help out, adding support for a plan node should take you less than 30 minutes.
Instructions
Branch: `GustavoAngulo/terrier:json_derulo`
1. Sign up for a plan type here. Feel free to do more than one! Once you have done one, the rest will go quickly.
2. Write the serialization and deserialization functions for the plan node. Start with `LimitPlanNode`, which is the most basic example. If you are doing a join plan (`*JoinPlanNode`) or a scan plan (`*ScanPlanNode`), look at `NestedLoopJoinPlanNode` and `IndexScanPlanNode` as examples.
3. Uncomment the cases in the switch statement in `abstract_plan_node.cpp:DeserializePlanNode` that create your specific plan node.
4. Write a test case. It should build a plan node using the builder object of the plan you added serialization for, serialize it, deserialize it, and compare the result to the original plan to make sure the two are equal. See `plan_node_json_test.cpp` for examples covering `LimitPlanNode`, `IndexScanPlanNode`, and `NestedLoopJoinPlanNode`.
5. Send me a patch (gangulo@andrew.cmu.edu) with your changes. Instructions on how to do that here.
Thanks for helping out!
Some notes
* Put the implementations of the serialization functions in the `.cpp` files, not in the `.h` files.
* You may need to write (de)serialization for any subtypes that are members of your plan node, such as `AggregatePlanNode::AggregateTerm`.
* Code style for JSON fields is the same as for variables: `lower_case_with_underscores`. Look at `LimitPlanNode::ToJson` as an example.
* You may need to remove `const` qualifiers from member variables so you can set their values during deserialization. This is okay.
* Make sure to use the `DEFINE_JSON_DECLARATIONS` macro in the namespace of your plan node, not inside the class.
* Make sure to call `AbstractPlanNode::FromJson` and `AbstractPlanNode::ToJson` from each (de)serializer for plan nodes; `LimitPlanNode` shows an example. If you are doing a scan or join plan, call the parent's (`AbstractJoinPlanNode` or `AbstractScanPlanNode`) serialization methods, not `AbstractPlanNode`'s methods; `IndexScanPlanNode` and `NestedLoopJoinPlanNode` show examples.
* Look at the test case in `plan_node_json_test.cpp` to see how (de)serialization works in practice.
* Make sure to add a default constructor to your plan node.
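The notes above (call the parent's serializer, drop `const` on members, provide a default constructor, then round-trip in a test) can be sketched roughly as follows. This is only an illustration under stated assumptions: `BasePlanNodeSketch` and `LimitPlanNodeSketch` are hypothetical stand-ins, not the real plan node classes, and a `std::map` stands in for `nlohmann::json` so the sketch compiles without the project or the JSON library. The real code should follow `LimitPlanNode` and `plan_node_json_test.cpp`.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// ASSUMPTION: stand-in for nlohmann::json so the sketch is self-contained.
using Json = std::map<std::string, int64_t>;

// Hypothetical, trimmed-down base class (the real AbstractPlanNode holds more state).
class BasePlanNodeSketch {
 public:
  BasePlanNodeSketch() = default;
  explicit BasePlanNodeSketch(int64_t num_children) : num_children_(num_children) {}
  virtual ~BasePlanNodeSketch() = default;

  virtual Json ToJson() const {
    Json j;
    j["num_children"] = num_children_;  // field names use lower_case_with_underscores
    return j;
  }

  virtual void FromJson(const Json &j) { num_children_ = j.at("num_children"); }

  bool operator==(const BasePlanNodeSketch &rhs) const { return num_children_ == rhs.num_children_; }

 private:
  int64_t num_children_ = 0;  // not const, so FromJson can assign it
};

// Hypothetical derived node with two fields of its own.
class LimitPlanNodeSketch : public BasePlanNodeSketch {
 public:
  LimitPlanNodeSketch() = default;  // default constructor used when deserializing
  LimitPlanNodeSketch(int64_t num_children, int64_t limit, int64_t offset)
      : BasePlanNodeSketch(num_children), limit_(limit), offset_(offset) {}

  Json ToJson() const override {
    Json j = BasePlanNodeSketch::ToJson();  // serialize the parent's fields first
    j["limit"] = limit_;
    j["offset"] = offset_;
    return j;
  }

  void FromJson(const Json &j) override {
    BasePlanNodeSketch::FromJson(j);  // restore the parent's fields first
    limit_ = j.at("limit");           // at() throws if the field is missing
    offset_ = j.at("offset");
  }

  bool operator==(const LimitPlanNodeSketch &rhs) const {
    return BasePlanNodeSketch::operator==(rhs) && limit_ == rhs.limit_ && offset_ == rhs.offset_;
  }

 private:
  int64_t limit_ = 0;
  int64_t offset_ = 0;
};

// Round trip in the shape of the test case: serialize, deserialize, compare.
bool RoundTripPreservesPlan() {
  LimitPlanNodeSketch original(1, 10, 5);
  Json j = original.ToJson();
  LimitPlanNodeSketch restored;
  restored.FromJson(j);
  return restored == original;
}
```

A scan or join plan would follow the same shape, with the intermediate parent class taking the place of `BasePlanNodeSketch` in the two parent calls.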
Coding Best Practices
* During deserialization, use the `at()` method and not `operator[]` to extract data from JSON objects. On a const JSON object, `operator[]` has undefined behavior if the field doesn't exist (on a non-const object it silently inserts a null value), whereas `at()` throws an exception. For example:
    // GOOD
    json.at("field").get<bool>();
    // BAD
    json["field"].get<bool>();
* Serializers and deserializers have already been written for oid types. You can serialize them as follows:
    nlohmann::json j;
    catalog::col_oid_t oid(0);
    j["oid"] = oid;
    auto deserialized_oid = j.at("oid").get<catalog::col_oid_t>();
* Use `parser::DeserializeExpression` to deserialize any `std::shared_ptr<AbstractExpression>` object. Also, when deserializing anything that is a pointer, you need to check that what was serialized was not a `nullptr`. Do that using `json::is_null()` as follows:
    auto expr = std::make_shared</* concrete expression type */>();
    nlohmann::json j;
    j["expr"] = expr;
    std::shared_ptr<AbstractExpression> deserialized_expr;
    if (!j.at("expr").is_null()) {
      deserialized_expr = parser::DeserializeExpression(j.at("expr"));
    }
* The JSON library knows how to (de)serialize STL containers like `std::vector` (provided that what they hold is also (de)serializable). For example:
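    nlohmann::json j;
    std::vector<catalog::col_oid_t> column_oids = {catalog::col_oid_t(1), catalog::col_oid_t(2), catalog::col_oid_t(3)};
    j["column_oids"] = column_oids;
    auto col_ids = j.at("column_oids").get<std::vector<catalog::col_oid_t>>();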
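To make the `at()` versus `operator[]` advice from the first best practice concrete, here is a small self-contained sketch. It assumes a `std::map` as a stand-in for a JSON object (so it compiles without the JSON library); the helper names `AtThrowsOnMissing` and `SizeAfterBracketRead` are made up for illustration. The contrast carries over to `nlohmann::json`, where `at()` throws on a missing key and non-const `operator[]` inserts a null entry.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// ASSUMPTION: std::map stands in for a JSON object with boolean fields.
using Json = std::map<std::string, bool>;

// Returns true if reading `key` with at() fails loudly with an exception.
bool AtThrowsOnMissing(const Json &j, const std::string &key) {
  try {
    (void)j.at(key);  // throws std::out_of_range when the key is absent
    return false;
  } catch (const std::out_of_range &) {
    return true;
  }
}

// Reads `key` with operator[] on a copy and reports the resulting size.
// A missing key is silently inserted with a default value, hiding the error.
std::size_t SizeAfterBracketRead(Json j, const std::string &key) {
  (void)j[key];
  return j.size();
}
```

With `at()`, a misspelled or missing field surfaces immediately as an exception in the deserializer instead of producing a silently default-initialized member.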