Closed securigy closed 1 year ago
I also see errors in some of these Memory samples.
It has been 2 weeks since I posted my bug report. Is there any update on the status? Has it been reproduced? Resolved? Is it being worked on?
I managed to successfully produce a tool that can answer questions about private data, but I had to do it using the OpenAI Embeddings API in C# + Pinecone C# bindings + OpenAI Completions. However, my goal is to be able to achieve it with SK. I think Microsoft needs to hurry with SK, as open-source development with LangChain is gaining momentum (unfortunately in Python).
@dmytrostruk can you take a look please?
If you guys need me to share my screen and demo to you - I can do it.
The exception is: Microsoft.SemanticKernel.Connectors.Memory.Pinecone.PineconeMemoryException: 'Index creation is not supported within memory store. It should be created manually or using CreateIndexAsync. Ensure index state is Ready.'
@securigy The reason you receive this exception from the SaveInformationAsync method is that the SemanticTextMemory logic checks whether the collection exists and, if it does not, tries to create it. This flow works fine for other memory connectors, but not for Pinecone. That's because collection creation in Pinecone is asynchronous: when you receive the HTTP response from the Pinecone API after a collection creation request, the creation process has started on the Pinecone side, but the collection (i.e. the index) is not provisioned yet; provisioning is still in progress. As mentioned in the exception message, the index state should be Ready before usage.
There are two ways to resolve this problem:
Call CreateIndexAsync in your code, but then you have to add logic on your side to perform HTTP polling, with your preferred strategy, until the index state on the Pinecone side is Ready, so that you can proceed with other operations.
Create the index manually and wait until its state is Ready.
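The polling step described above can be sketched roughly as follows. This is a minimal illustration, not part of Semantic Kernel: the class IndexReadiness, the method WaitUntilReadyAsync, and the getStateAsync delegate are hypothetical names, and the delegate is assumed to return the current index state as text.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class IndexReadiness
{
    // Hypothetical helper: polls until the reported state is "Ready" or the timeout elapses.
    // getStateAsync is an assumption: wire it to whatever call returns the index state as text.
    public static async Task<bool> WaitUntilReadyAsync(
        Func<Task<string>> getStateAsync,
        TimeSpan pollInterval,
        TimeSpan timeout,
        CancellationToken cancellationToken = default)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (true)
        {
            string state = await getStateAsync();
            if (string.Equals(state, "Ready", StringComparison.OrdinalIgnoreCase))
            {
                return true; // index is provisioned, safe to continue
            }
            if (DateTime.UtcNow >= deadline)
            {
                return false; // gave up waiting; the caller decides what to do next
            }
            // Fixed delay between polls; swap in backoff if preferred.
            await Task.Delay(pollInterval, cancellationToken);
        }
    }
}
```

With the connector discussed in this thread, the delegate could presumably wrap an awaited DescribeIndexAsync call and return its Status.State as a string, but that wiring is an assumption and should be checked against the connector version in use.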
But I do create the index in the code, as shown below, and it ran to completion (idx.Status = RanToCompletion).
Pinecone index state can have the values Initializing, ScalingUp, ScalingDown, Terminating, and Ready.
In your case, RanToCompletion is not a state of Pinecone index creation; it is the status of your Task (documentation: https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.taskstatus?view=net-7.0#fields).
You output the status of the Task because in your example there is no await when calling the DescribeIndexAsync method, so you are working with the Task itself. You can either add await or get the index state details by using idx.Result.Status.State.
Task<PineconeIndex> idx = pinecone.DescribeIndexAsync("pinecone-index");
idx.Wait();
Console.WriteLine("idx.Status: {0}", idx.Status);
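The difference between the status of a Task and the value it produces can be illustrated with a small self-contained sketch. GetIndexStateAsync here is a hypothetical stand-in for a call such as DescribeIndexAsync, not the real connector API; it finishes successfully while reporting an index that is still initializing.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskStatusDemo
{
    // Hypothetical stand-in for a describe-index call: the call itself succeeds,
    // but the index it reports on is not ready yet.
    public static async Task<string> GetIndexStateAsync()
    {
        await Task.Delay(10);
        return "Initializing";
    }

    public static async Task<(TaskStatus TaskStatus, string IndexState)> CompareAsync()
    {
        Task<string> task = GetIndexStateAsync();
        string indexState = await task;   // the value the call produced
        return (task.Status, indexState); // task.Status describes only the Task itself
    }
}
```

Here CompareAsync yields RanToCompletion for the Task status and "Initializing" for the awaited value, which mirrors the situation above: a completed Task says nothing about whether the index is Ready.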
and this behavior is observed not only in this example but in QDRANT example as well: Example19_Qdrant
It would be really helpful if you could provide more details about the exception you are receiving in the Qdrant example, so we can investigate the problem.
Thanks a lot!
Dmytro, your post is really detailed and seems helpful - I will try it as soon as I can. THANK YOU!
======== I wonder if you can provide some general guidance (not sure where else to ask):
The bottom line is: my head is spinning from the multitude of options for model training through Embeddings (due to the vector DB search options). Where can I find top-level guidance or a comparison of all the options? And which of them can be used locally on my machine, with or without the complication of Docker?
@securigy In general, there is no specific recommendation on which vector DB to use, because it all depends on what scenario you want to cover and what result you want to achieve. The options differ in cost, performance, and set of available features.
My recommendation would be to try each available option manually and see which one works best for your scenario. In the repository we have integrations with multiple vector DBs, including examples. Some of them can be used locally on your machine using Docker containers (e.g. Redis, Postgres, Qdrant, Chroma): https://github.com/microsoft/semantic-kernel/tree/main/dotnet/src/Connectors
I would also suggest joining our Discord so you can discuss this question with the wider community: https://aka.ms/SKDiscord
If you don't mind, I will close this issue, because the behavior in your use case is expected given the limitations on the Pinecone side. If you encounter any problems with Pinecone or any other functionality, feel free to open another issue; this will be very helpful.
Thank you!
I took the example source code, Example38_Pinecone, and pasted it into my simple WinForms application. Then I made some changes. Specifically, I delete all the existing indexes and collections before I do any operations, like creating an index, etc. The entire code is below:
`
public static async Task RunAsync()
{
    using (Log.VerboseCall())
    {
        string apiKey = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";
        string pineconeEnvironment = "us-west1-gcp-free";
The indicated line with the comment above throws an exception, no matter what... and this behavior is observed not only in this example but in the QDRANT example as well: Example19_Qdrant. The same line throws an exception. So, at this point it looks like a general problem... Am I the only one trying these examples?
Just to note that I did deploy the correct resource in Azure with the correct model, "text-embedding-ada-002". And my Pinecone key and environment that I got from Pinecone are correct, as is the OpenAI ApiKey. Is there anything else I need to do?
The exception is: Microsoft.SemanticKernel.Connectors.Memory.Pinecone.PineconeMemoryException: 'Index creation is not supported within memory store. It should be created manually or using CreateIndexAsync. Ensure index state is Ready.'
But I do create the index in the code, as shown below, and it ran to completion (idx.Status = RanToCompletion).
Screenshots
Stack Trace:
at Microsoft.SemanticKernel.Connectors.Memory.Pinecone.PineconeMemoryStore.d2.MoveNext() in /home/vsts/work/1/s/semantic-kernel/dotnet/src/Connectors/Connectors.Memory.Pinecone/PineconeMemoryStore.cs:line 68
at Microsoft.SemanticKernel.Memory.SemanticTextMemory.d 3.MoveNext()
at GptModelTrainer.MainForm.d__21.MoveNext() in C:\Projects\GptModelTrainer\GptModelTrainer\MainForm.cs:line 376