Open Ritaja opened 1 year ago
@Yuqing-cat to support @Ritaja on this issue.
For the first problem, add a log message to inform users to use the API approach: https://github.com/feathr-ai/feathr/pull/892. For the second (NGINX) problem, I cannot reproduce it in my environment. @Ritaja, could you help dump the request headers so we can understand why nginx rejects the request from the Python client? The default nginx buffer size is 8k, which means the client is sending a request whose header size exceeds 8k, a size our sample should not reach.
@Yuqing-cat I could reproduce it through an Azure deployment. Could you deploy the Feathr components on Azure following the guide and test with the code snippets above?
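To help with the header dump requested above, here is a minimal sketch that estimates how large a request head is and which line is the largest. It is not part of Feathr: `header_sizes` and `NGINX_BUFFER_BYTES` are hypothetical names, the 8k value is simply the figure quoted in this comment, and the headers you pass in should be the ones your client actually sends. nginx applies its buffer limit to the request line and to each header field individually, so the largest single line is reported alongside the total.

```python
from typing import Dict, Tuple

# The 8k figure quoted in the comment above; adjust to match your nginx.conf.
NGINX_BUFFER_BYTES = 8 * 1024


def header_sizes(method: str, path: str, headers: Dict[str, str]) -> Tuple[int, int, str]:
    """Return (total_bytes, largest_line_bytes, largest_line_name) for an HTTP/1.1 request head."""
    # Each line of the request head ends with CRLF.
    lines = {"<request line>": f"{method} {path} HTTP/1.1\r\n"}
    for name, value in headers.items():
        lines[name] = f"{name}: {value}\r\n"

    sizes = {name: len(line.encode("utf-8")) for name, line in lines.items()}
    largest = max(sizes, key=sizes.get)
    total = sum(sizes.values()) + 2  # plus the blank line that ends the head
    return total, sizes[largest], largest
```

For example, calling `header_sizes("POST", "/some/path", dict(prepared_request.headers))` on the headers captured from the failing call shows whether any single line exceeds `NGINX_BUFFER_BYTES`. The path and the `prepared_request` object here are placeholders, not names from the Feathr client.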
Willingness to contribute
Yes. I would be willing to contribute a fix for this bug with guidance from the Feathr community.
Feathr version
0.9.0
System information
(Problem is not specific to above system info, please refer below)
Describe the problem
Problem 1: The Feathr client communicates directly with Purview (Atlas) when the environment variable/config is set to the Purview name and no other details are provided in the feature registry:
os.environ['feature_registry__purview__purview_name'] = f'{purview_name}'
vs. setting:
os.environ['FEATURE_REGISTRY__API_ENDPOINT'] = f'https://{resource_prefix}webapp.azurewebsites.net/api/v1'
Problem 2: When the client uses the Feathr REST API with Purview as the backend, registration of features fails with an NGINX error:
feathr_client.register_features()
Possible fix: adapting https://github.com/feathr-ai/feathr/blob/main/deploy/nginx.conf
Tracking information
No response
Code to reproduce bug
Feathr config yml (common for both problems):
Problem 1:
Environment variable settings for the Python client:
Problematic setting: add this to the environment above:
os.environ['feature_registry__purview__purview_name'] = f'{purview_name}'
<-- this seems to use the Purview client, communicating directly with the registry rather than through the REST API.
If we define the REST API endpoint instead, that is, instead of
os.environ['feature_registry__purview__purview_name'] = f'{purview_name}'
we add
os.environ['FEATURE_REGISTRY__API_ENDPOINT'] = f'https://{resource_prefix}webapp.azurewebsites.net/api/v1'
then the client uses the REST API.
Problem 2:
Use the same Feathr config YAML from above and the same environment variables, but now force the client to use the REST API using
os.environ['FEATURE_REGISTRY__API_ENDPOINT'] = f'https://{resource_prefix}webapp.azurewebsites.net/api/v1'
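The switch between the two registry settings described in the reproduction steps could be wrapped in a small helper, sketched below. `use_rest_registry` is a hypothetical name, not a Feathr function; it only sets the environment variables quoted in this issue. Removing the Purview variable is an assumption made so that a single registry setting is present. Whether Feathr needs that is not confirmed here.

```python
import os
from typing import MutableMapping

# Variable names copied verbatim from this issue.
PURVIEW_VAR = "feature_registry__purview__purview_name"
API_ENDPOINT_VAR = "FEATURE_REGISTRY__API_ENDPOINT"


def use_rest_registry(resource_prefix: str,
                      env: MutableMapping[str, str] = os.environ) -> str:
    """Point the client at the registry REST API instead of Purview directly."""
    endpoint = f"https://{resource_prefix}webapp.azurewebsites.net/api/v1"
    # Assumption: drop the direct-Purview setting so only one registry option remains.
    env.pop(PURVIEW_VAR, None)
    env[API_ENDPOINT_VAR] = endpoint
    return endpoint
```

Calling `use_rest_registry(resource_prefix)` before creating the Feathr client leaves the environment in the state used to reproduce Problem 2.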
What component(s) does this bug affect?
Python Client: This is the client users use to interact with most of our API. Mostly written in Python.
Computation Engine: The computation engine that executes the actual feature join and generation work. Mostly in Scala and Spark.
Feature Registry API: The frontend API layer supports SQL and Purview (Atlas) as storage. The API layer is in Python (FastAPI).
Feature Registry Web UI: The Web UI for the feature registry. Written in React.