Open DBundred-cfc opened 10 months ago
Hey @DBundred-cfc :wave:! Thank you so much for reporting the issue/feature request :rotating_light:. Someone from SynapseML Team will be looking to triage this issue soon. We appreciate your patience.
It is still present in the current version.
Spray seems to have a lot of trouble serializing and deserializing JSON. For example: spray.json.DeserializationException: Expected String as JsString, but got {"flags":null,"lowercase":true,"name":"patternAnalyzer","pattern":"\\s+","stopwords":[]}
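To make the shape of that error concrete, here is a rough Python analogue of a strict string reader being handed the analyzer object from the message above. This is only a sketch: `DeserializationError` and `read_string` are hypothetical stand-ins for spray-json's behaviour, not SynapseML or spray code, and exactly which field SynapseML reads as a string is an assumption that this thread does not confirm.

```python
import json


class DeserializationError(Exception):
    """Hypothetical stand-in for spray.json.DeserializationException."""


def read_string(value):
    """Accept only a JSON string, mimicking a strict string format."""
    if not isinstance(value, str):
        compact = json.dumps(value, separators=(",", ":"))
        raise DeserializationError(
            f"Expected String as JsString, but got {compact}"
        )
    return value


# The analyzer object exactly as it appears in the error message.
analyzer_json = (
    r'{"flags":null,"lowercase":true,"name":"patternAnalyzer",'
    r'"pattern":"\\s+","stopwords":[]}'
)
analyzer = json.loads(analyzer_json)

# Reading the whole object as a string fails, as in the reported error.
try:
    read_string(analyzer)
    failed = False
    message = ""
except DeserializationError as err:
    failed = True
    message = str(err)

# Reading a field that really is a string works fine.
name = read_string(analyzer["name"])
```

If the reader were written to accept a full analyzer object rather than a bare string, the same payload would parse, which is consistent with the failure only appearing on indexes that define custom analyzers.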
SynapseML version
0.11.0-1.0.2
System information
Describe the problem
When a version of SynapseML is used to load data into an index that has a custom analyzer or tokenizer (and possibly other custom objects, but they haven't been tested), it fails with the following error:
This happens with every apiVersion setting and seemingly with any SynapseML version greater than 0.11.0. The same load works correctly against the same index when run on a Spark 3.2 Azure Synapse cluster, which uses SynapseML version 0.10.2.
Code to reproduce issue
Create an index with a custom analyzer
This needs to be done through the REST API:
POST https://{{service-name}}.search.windows.net/indexes?api-version={{api-version}}
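As an illustration of that request, here is a minimal sketch that builds (but does not send) the POST using only the Python standard library. The service name, index name, field names, admin key, and the `2023-11-01` API version are all placeholder assumptions, not values from the original report; the analyzer settings mirror the `patternAnalyzer` object visible in the error message.

```python
import json
import urllib.request

# Placeholder values (assumptions): substitute your own before sending.
SERVICE_NAME = "my-search-service"
API_VERSION = "2023-11-01"
ADMIN_KEY = "<admin-api-key>"

# Hypothetical index with one field that uses a custom pattern analyzer.
index_definition = {
    "name": "custom-analyzer-index",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {
            "name": "content",
            "type": "Edm.String",
            "searchable": True,
            "analyzer": "patternAnalyzer",
        },
    ],
    "analyzers": [
        {
            "@odata.type": "#Microsoft.Azure.Search.PatternAnalyzer",
            "name": "patternAnalyzer",
            "lowercase": True,
            "pattern": r"\s+",
            "stopwords": [],
        }
    ],
}


def build_create_index_request(service_name, api_version, admin_key, definition):
    """Build the POST request that creates the index; nothing is sent here."""
    url = (
        f"https://{service_name}.search.windows.net/indexes"
        f"?api-version={api_version}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": admin_key},
        method="POST",
    )


request = build_create_index_request(
    SERVICE_NAME, API_VERSION, ADMIN_KEY, index_definition
)
# To actually create the index, call: urllib.request.urlopen(request)
```

Any index whose definition carries an entry under `analyzers` should be enough to exercise the code path described here; the specific fields are not important.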
Try and load the index
Run the following PySpark code on Spark 3.4:
You can also run the same code (without the Spark session creation) on Azure Synapse with Spark 3.3 and get the same result. I imagine this will also happen on Databricks and on Synapse with Spark 3.4, but I haven't tested either.
Other info / logs
What component(s) does this bug affect?
area/cognitive: Cognitive project
area/core: Core project
area/deep-learning: DeepLearning project
area/lightgbm: Lightgbm project
area/opencv: Opencv project
area/vw: VW project
area/website: Website
area/build: Project build system
area/notebooks: Samples under notebooks folder
area/docker: Docker usage
area/models: models related issue
What language(s) does this bug affect?
language/scala: Scala source code
language/python: Pyspark APIs
language/r: R APIs
language/csharp: .NET APIs
language/new: Proposals for new client languages
What integration(s) does this bug affect?
integrations/synapse: Azure Synapse integrations
integrations/azureml: Azure ML integrations
integrations/databricks: Databricks integrations