Open vikramsubramanian opened 7 months ago
Summary: Need to add support for exporting VARIANT data type to different formats and test integrations with networkx, pyg, and other libraries. Also need to test reading VARIANT data in Python, Java, Rust, and C++.
Based on the provided information, the following solutions can be applied to address the issue:
Ensure that the export functions export_literal_table_to_csv and export_literal_table_to_parquet in the Python code handle the VARIANT data type correctly by implementing the necessary data conversion logic for VARIANT to CSV and Parquet formats.
Update the integration test functions test_networkx_integration_with_literal_table and test_pyg_integration_with_literal_table to verify that networkx and pyg can handle Literal Node Tables containing VARIANT data types.
For the language bindings in Python, Java, Rust, and C++, implement the read_variant function to correctly interpret and convert the VARIANT data type from the database into the respective language's native data types.
Add test cases for exporting the VARIANT data type to ensure that the export functions work correctly across different formats like CSV and Parquet.
Add test cases for the integrations with networkx and pyg to ensure that the VARIANT data type is supported and correctly handled when using these libraries.
Add test cases for each language binding (Python, Java, Rust, C++) to ensure that the VARIANT data type is correctly read and converted into the native data types of each language.
Review the code snippets provided for any relevant implementations or test cases that can be adapted or extended to support the VARIANT data type. If any existing code handles similar data types or export functionalities, use that as a reference for implementing support for VARIANT.
Ensure that the RdfReader and RdfScan classes in the C++ code correctly handle the VARIANT data type when reading from the database, and that the CastFromRdfVariant template specializations correctly cast the VARIANT data type to the appropriate C++ data types.
Verify that the Node.js and Rust API tests cover the VARIANT data type and its conversions to ensure compatibility and correctness across different language bindings.
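The read path these suggestions describe comes down to mapping a tagged value onto a native type. A minimal Python sketch of that idea follows. It assumes a hypothetical (type tag, raw text) representation; decode_variant and the tag names are illustrative only, not part of any existing binding, and the real set of types VARIANT can hold may differ.

```python
from datetime import date
from typing import Any, Callable, Dict

# Hypothetical type tags, chosen for illustration only.
_DECODERS: Dict[str, Callable[[str], Any]] = {
    "INT64": int,
    "DOUBLE": float,
    "BOOL": lambda s: s.strip().lower() == "true",
    "STRING": str,
    "DATE": date.fromisoformat,
}


def decode_variant(type_name: str, raw: str) -> Any:
    """Convert a (type tag, text) pair into a native Python value."""
    try:
        decoder = _DECODERS[type_name]
    except KeyError:
        # Fail loudly on unknown tags instead of silently returning a string.
        raise ValueError(f"unsupported variant type: {type_name}") from None
    return decoder(raw)
```

A per-binding test could then loop over one sample value for each supported type and compare the decoded result with the expected native value.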
src/processor/operator/persistent/reader/rdf/rdf_reader.cpp
This file contains the logic for reading RDF data, which is relevant to handling the VARIANT data type in RDF exports.
src/function/cast/cast_rdf_variant.cpp
This file contains casting functions for RDF VARIANT data types, which are relevant to exporting these types to different formats.
src/processor/operator/persistent/reader/rdf/rdf_scan.cpp
This file contains the logic for scanning RDF data, which may need to be modified to support exporting VARIANT data types.
tools/python_api/test/test_arrow.py
This file tests the Python API's ability to handle different data types with Arrow, which is relevant for testing VARIANT data type support.
This file defines the logical types in Rust API, including handling of VARIANT data types.
Currently the VARIANT data type is only supported as part of the Literal Node Table. You cannot create normal tables with a VARIANT column, and therefore cannot ingest CSV or Parquet data into one. However, you can still export the Literal table. We should figure out how to export the VARIANT data type to different formats, such as CSV, Parquet, and any other format we support.
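One possible export layout, sketched here as an assumption rather than a decided design, writes each value as text next to a companion type-tag column so that the original type survives the round trip. The function names, column names, and tags below are hypothetical, and the sketch uses only the Python standard library rather than the database API.

```python
import csv
import io
from datetime import date
from typing import Any, Iterable, Tuple


def variant_to_cells(value: Any) -> Tuple[str, str]:
    """Return (text, type tag) for one mixed-type value."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return ("true" if value else "false", "BOOL")
    if isinstance(value, int):
        return (str(value), "INT64")
    if isinstance(value, float):
        return (repr(value), "DOUBLE")
    if isinstance(value, date):
        return (value.isoformat(), "DATE")
    return (str(value), "STRING")


def export_variants_to_csv(rows: Iterable[Tuple[int, Any]]) -> str:
    """Serialize (id, value) rows to CSV text with a type-tag column."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["id", "val", "val_type"])
    for row_id, value in rows:
        text, tag = variant_to_cells(value)
        writer.writerow([row_id, text, tag])
    return buf.getvalue()
```

Parquet columns are typed, so a similar choice would be needed there: either a string column plus a type column as above, or a richer encoding such as a union, if the writer supports one.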
We should also test how our integrations with networkx, pyg, and any other libraries behave with the Literal table.
Also, let's test whether we can use VARIANT in Python, Java, Rust, and C++. Can we read the values correctly for each data type that it supports?