Closed jhamon closed 2 months ago
Total nit that you can obviously ignore if you want, but maybe we should change the title of this PR from "Early access bulk import" to "Add early access bulk import", just so users later on know that this PR added the functionality (rather than iterating on it, removing it, etc.)
Problem
Implement the following new methods:
start_import
describe_import
list_imports
cancel_import
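To make the intended call flow concrete, here is a minimal in-memory sketch of how the four methods might fit together. It is not the Pinecone client: the class name FakeImportIndex, the uri parameter, the returned dictionaries, and the status strings are all assumptions made up for illustration.

```python
import itertools


class FakeImportIndex:
    """Hypothetical in-memory stand-in for an index; not the real client."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._imports = {}

    def start_import(self, uri):
        # Register a new import and hand back its identifier.
        import_id = str(next(self._ids))
        self._imports[import_id] = {
            "id": import_id,
            "uri": uri,
            "status": "Pending",  # assumed status value
        }
        return {"id": import_id}

    def describe_import(self, id):
        # Return a copy so callers cannot mutate internal state.
        return dict(self._imports[id])

    def list_imports(self):
        # Yield every known import, oldest first.
        for record in self._imports.values():
            yield dict(record)

    def cancel_import(self, id):
        # Mark the import as cancelled; assumed to return an empty body.
        self._imports[id]["status"] = "Cancelled"  # assumed status value
        return {}


index = FakeImportIndex()
started = index.start_import(uri="s3://example-bucket/vectors/")
index.cancel_import(id=started["id"])
print(index.describe_import(id=started["id"]))
```

The point of the sketch is only the lifecycle: start an import, look it up by id, enumerate imports, and cancel one.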
Solution
Code generation changes
Since these features are in prerelease, they only exist in the spec for the upcoming 2024-10 API version. This required me to modify the codegen script, which is now run as:
The second boolean argument is used to tell the codegen script whether the generated code should be stored in a new pinecone/core_ea subpackage. In the future we should probably do more to hide this complexity from the developer, but for now it is good enough.
Code organization
I have put the bespoke bits of the implementation that wrap the generated code into a new class, ImportFeatureMixin, that the Index class inherits from. These functions could all have been implemented directly in the Index class, but I thought it a bit tidier to segregate them into a separate spot rather than dump everything into one giant file.
Overridden repr representation on generated objects
The default print output in the generated classes comes from pprint, and it looks quite poor for large objects. So I installed overrides that dump the objects as formatted JSON instead. I had previously done something similar for describe_index and related methods, so for this PR it was just a matter of cleaning up that logic a bit and moving it somewhere it could be reused.
So far, I haven't tweaked the generated classes to use this approach across the board because it doesn't work well for long arrays of vector values.
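As a rough illustration of the override idea described above, the sketch below swaps a class's printed form for indented JSON. It is not the code from this PR: GeneratedModel is a hypothetical stand-in for a generated class, and install_json_repr_override, json_repr, and the to_dict method are names assumed for the example.

```python
import json
from datetime import date, datetime


class GeneratedModel:
    """Hypothetical stand-in for a generated model class."""

    def __init__(self, **fields):
        self._data_store = dict(fields)

    def to_dict(self):
        return dict(self._data_store)


def _json_default(value):
    # json cannot serialize datetimes natively, so emit ISO 8601 strings.
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    return str(value)


def json_repr(self):
    # Render the object's fields as indented JSON.
    return json.dumps(self.to_dict(), indent=4, default=_json_default)


def install_json_repr_override(klass):
    # Replace both printed forms on the class itself, so every
    # instance picks up the JSON output.
    klass.__repr__ = json_repr
    klass.__str__ = json_repr
    return klass


install_json_repr_override(GeneratedModel)

model = GeneratedModel(
    id="1", status="Pending", created_at=datetime(2024, 10, 1, 12, 0, 0)
)
print(model)
```

Installing the override per class, rather than in a shared base class, matches the caveat above: classes that hold long arrays of vector values can simply be left with their default output.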
Type of Change
Test Plan
Manual testing with a dev release is in this demo notebook