pondd-project / pondd

BSD 3-Clause "New" or "Revised" License

Deployment of the Demo DID Finder #15

Open zonca opened 2 years ago

zonca commented 2 years ago

I plan to use the Demo DID Finder as a starting point for the SuperCDMS DID finder.

So first I am trying to deploy and test it.

Then I test it using the servicex package, as in simple_query.ipynb, and I get the same error I was getting with the CERN opendata:

Traceback (most recent call last):
  File "simple_query.py", line 8, in <module>
    sx_dataset = ServiceXDataset("demo://dataset1", backend_name='dev_uproot')
  File "/home/zonca/miniconda3/envs/pondd/lib/python3.8/site-packages/servicex/servicex.py", line 220, in __init__
    end_point, token = config.get_servicex_adaptor_config(backend_name)
  File "/home/zonca/miniconda3/envs/pondd/lib/python3.8/site-packages/servicex/servicex_config.py", line 173, in get_servicex_adaptor_config
    config = self._get_backend_info(backend_name)
  File "/home/zonca/miniconda3/envs/pondd/lib/python3.8/site-packages/servicex/servicex_config.py", line 121, in _get_backend_info
    raise ServiceXException(f'Unable to find name/type {backend_name} '
servicex.utils.ServiceXException: (ServiceXException(...), 'Unable to find name/type dev_uproot in api_endpoints in servicex.yaml configuration file. Saw only names (default, default) and types (xaod, cms_run1_aod)')

If I instead test with post.py from the Girder DID finder, I get:

python post.py 5000 yt.json 
Handling connection for 5000
{'message': 'DID scheme is not supported: demo'}

How do I tell ServiceX that the demo DID finder is available? @BenGalewsky @Michael-D-Johnson @gordonwatts

Michael-D-Johnson commented 2 years ago

@zonca I had a similar issue working with the yt DID finder. In my values.yaml file for the helm chart I have added a section for the yt girder transformer:

didFinder:
  girder:
    cachePrefix: null
    enabled: true
    image: servicex-did-finder-girder
    pullPolicy: Always
    tag: latest
  CERNOpenData:
   ...

So in your case I think you'd replace girder with "demo" and the relevant image name and tags.

The deployment.yaml file that you pasted above I have living in its own folder in the servicex helm directory structure: templates/did-finder-girder/deployment.yaml.

zonca commented 2 years ago

Thanks @Michael-D-Johnson, I want to keep the charts separated, so I edited the ConfigMap pondd-servicex-flask-config, and set demo in VALID_DID_SCHEMES and the default, restarted the pods and now your post.py is working! I get:

{'request_id': 'xxxxxxxx'}

Later I'll find a way to do this cleanly.
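
The scheme check behind the earlier 'DID scheme is not supported: demo' message can be pictured with a small sketch. This is not the ServiceX app's actual code: the function names split_did and check_did, the example scheme lists, and the default scheme are hypothetical. Only the error message text and the idea of a VALID_DID_SCHEMES list come from this thread.

```python
def split_did(did, default_scheme):
    """Split 'scheme://name' into (scheme, name); use the default scheme if none is given."""
    if "://" in did:
        scheme, name = did.split("://", 1)
        return scheme, name
    return default_scheme, did


def check_did(did, valid_schemes, default_scheme):
    """Return a routing decision, or an error message for a scheme that is not listed."""
    scheme, name = split_did(did, default_scheme)
    if scheme not in valid_schemes:
        return {"message": f"DID scheme is not supported: {scheme}"}
    return {"scheme": scheme, "did": name}


# Hypothetical list without "demo", as before the ConfigMap edit:
print(check_did("demo://dataset1", ["rucio", "cernopendata"], "rucio"))
# With "demo" added to the valid schemes and set as the default:
print(check_did("dataset1", ["rucio", "cernopendata", "demo"], "demo"))
```

Under these assumptions the first call is rejected and the second is routed to the demo finder with the bare name dataset1.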

Next I would like to be able to access the data with Python, so I am trying with:

#!/usr/bin/env python
# coding: utf-8

# # A Simple DID Finder
from func_adl_servicex import ServiceXSourceUpROOT
from servicex import ServiceXDataset

sx_dataset = ServiceXDataset("demo://dataset1", backend_name='dev_uproot')
ds = ServiceXSourceUpROOT(sx_dataset, "mini")

data = ds.Select("lambda e: {'JetPT': e['jet_pt']}").AsAwkwardArray().value()

data['JetPT']

But I get the same error as above. @BenGalewsky, do you know how to fix this?

zonca commented 2 years ago

@BenGalewsky it seems like for development I can specify no backend and the client will connect to ServiceX directly on localhost:5000.

so:

#!/usr/bin/env python
# coding: utf-8

# # A Simple DID Finder
from func_adl_servicex import ServiceXSourceUpROOT
from servicex import ServiceXDataset

sx_dataset = ServiceXDataset("demo://dataset1")
ds = ServiceXSourceUpROOT(sx_dataset, "mini")

data = ds.Select("lambda e: {'JetPT': e['jet_pt']}").AsAwkwardArray().value()

data['JetPT']

The code runs until ds.Select, where it gives the error:

ServiceXException: (ServiceXException(...), 'ServiceX rejected the transformation request: (400){"message": "Failed to submit transform request: Failed to generate translation code: Unknown id: ResultTTree"}\n')

Am I missing anything in the deployment?

BenGalewsky commented 2 years ago

This is going to be messy until we get @Michael-D-Johnson 's python-function based code generator. Which CodeGenerator are you deploying with helm?

I think it might work with sslhep/servicex_code_gen_func_adl_uproot

zonca commented 2 years ago

yes, @BenGalewsky I have that codegenerator in my configuration https://github.com/pondd-project/ServiceX_DID_Finder_Demo/blob/main/values_minimal.yaml

This is the example code from the Demo DID Finder, https://github.com/ssl-hep/ServiceX_DID_Finder_Demo/blob/main/sample/simple_query.ipynb, so it should work. I am worried there is something missing in the deployment.

zonca commented 2 years ago

Posting some logs.

See test files I'm using at:

https://github.com/pondd-project/ServiceX_DID_Finder_Demo/tree/main/tests

Post request on port 5000

If I do a POST request with python tests/post.py 5000 tests/yt.json I get:

Codegen

/tmp/tmpu3u7pvvg/44568cb73bc69fd2fe0c48859dff0332
total 4.0K
-rw-r--r-- 1 servicex servicex 1.4K Jan  9 05:59 generated_transformer.py
10.244.0.10 - - [09/Jan/2022:05:59:32 +0000] "POST /servicex/generated-code HTTP/1.1" 200 687 "-" "python-requests/2.27.1"

Demo DID finder

INFO pondd-servicex demo_did_finder None Received DID request {'request_id': 'c0d894db-f5e3-4b0f-9e51-637673b2015d', 'did': 'dataset1', 'service-endpoint': 'http://pondd-servicex-servicex-app:8000/servicex/internal/transformation/c0d894db-f5e3-4b0f-9e51-637673b2015d'}
INFO pondd-servicex demo_did_finder c0d894db-f5e3-4b0f-9e51-637673b2015d Looking up dataset {did_name}

Python

python tests/simple_query.py

Codegen

Traceback (most recent call last):
  File "/home/servicex/servicex/code_generator_service/generate_code.py", line 42, in post
    zip_data = self.translator.translate_text_ast_to_zip(code)
  File "/home/servicex/servicex/code_generator_service/ast_translator.py", line 101, in translate_text_ast_to_zip
    r = self.get_generated_uproot(a, tempdir)
  File "/home/servicex/servicex/code_generator_service/ast_translator.py", line 71, in get_generated_uproot
    src = generate_python_source(a)
  File "/usr/local/lib/python3.7/site-packages/func_adl_uproot/translation.py", line 23, in generate_python_source
    source += '    return ' + python_ast_to_python_source(python_ast) + '\n'
  File "/usr/local/lib/python3.7/site-packages/func_adl_uproot/translation.py", line 10, in python_ast_to_python_source
    return PythonSourceGeneratorTransformer().get_rep(python_ast)
  File "/usr/local/lib/python3.7/site-packages/func_adl_uproot/transformer.py", line 64, in get_rep
    node = self.visit(node)
  File "/usr/local/lib/python3.7/site-packages/func_adl_uproot/transformer.py", line 52, in visit
    return super(PythonSourceGeneratorTransformer, self).visit(node)
  File "/usr/local/lib/python3.7/ast.py", line 271, in visit
    return visitor(node)
  File "/usr/local/lib/python3.7/site-packages/func_adl_uproot/transformer.py", line 306, in visit_Call
    func_rep = self.get_rep(node.func)
  File "/usr/local/lib/python3.7/site-packages/func_adl_uproot/transformer.py", line 64, in get_rep
    node = self.visit(node)
  File "/usr/local/lib/python3.7/site-packages/func_adl_uproot/transformer.py", line 52, in visit
    return super(PythonSourceGeneratorTransformer, self).visit(node)
  File "/usr/local/lib/python3.7/ast.py", line 271, in visit
    return visitor(node)
  File "/usr/local/lib/python3.7/site-packages/func_adl_uproot/transformer.py", line 122, in visit_Name
    node.rep = self.resolve_id(node.id)
  File "/usr/local/lib/python3.7/site-packages/func_adl_uproot/transformer.py", line 116, in resolve_id
    raise NameError('Unknown id: ' + id)
NameError: Unknown id: ResultTTree
10.244.0.10 - - [09/Jan/2022:06:05:29 +0000] "POST /servicex/generated-code HTTP/1.1" 500 39 "-" "python-requests/2.27.1"

Demo DID finder

nothing

gordonwatts commented 2 years ago

I'm not totally understanding these logs here, I'm afraid - Demo DID Finder and CodeGen appear twice, for example.

Are you expecting the bomb in Codegen? Or is that the question you have?

gordonwatts commented 2 years ago

I also am a little puzzled by this line:

INFO pondd-servicex demo_did_finder c0d894db-f5e3-4b0f-9e51-637673b2015d Looking up dataset {did_name}

It looks like we forgot the f prefix on that string. But I don't see that in the did library or demo. What log file is that from? Is that from the ServiceXApp log file?
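
A minimal sketch of why a log line would show the literal text {did_name}: without the f prefix, Python leaves the braces untouched. The variable names here are illustrative and not taken from the DID finder source.

```python
did_name = "dataset1"

# Plain string: the placeholder is not interpolated.
without_prefix = "Looking up dataset {did_name}"

# f-string: the placeholder is replaced by the variable's value.
with_prefix = f"Looking up dataset {did_name}"

print(without_prefix)
print(with_prefix)
```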

zonca commented 2 years ago

I'm not totally understanding these logs here, I'm afraid - Demo DID Finder and CodeGen appear twice, for example.

I have the logs of both of them for 2 different cases: the POST request (tests/post.py) and the Python client (tests/simple_query.py).

Are you expecting the bomb in Codegen? Or is that the question you have?

I would like to understand why:

python tests/simple_query.py

is not working. I don't understand what the purpose of Codegen is.

See https://github.com/pondd-project/ServiceX_DID_Finder_Demo/blob/main/tests/simple_query.py, it is based off your https://github.com/ssl-hep/ServiceX_DID_Finder_Demo/blob/main/sample/simple_query.ipynb

BenGalewsky commented 2 years ago

Ok, @zonca - I deployed my local serviceX and did some snooping around. First of all, the helm chart doesn't support the demo DID finder. It won't deploy it into the cluster, and even if you manually deploy it, the App will refuse to route to it. You would have to add it to the list of valid DID finders.

Consequently, I decided to run your example against the CERN open data DID finder which is supported. I changed the code to match our current libraries. (There is a distressing number of stale examples and incorrect documentation out there)

from func_adl_servicex import ServiceXSourceUpROOT

ds = ServiceXSourceUpROOT("cernopendata://1507", "nominal")

data = ds.Select("lambda e: {'JetPT': e['jet_pt']}").AsAwkwardArray().value()

data['JetPT']

This is sufficient to trigger the DID finder (and then the transformers will fail due to the files not being in the format this specific request is expecting).

My suggestion is to

  1. Deploy the CERN open data DID finder and get this example to trigger the DID finder first.
  2. Make a copy of the demo DID finder and call it SuperCDMSDataFinder and extend the helm chart to allow it to deploy this DID finder and to add it to the list of valid DID Finders
  3. Try this example again against your new DID Finder.

We are getting to the point that we need to rethink this aspect of the helm chart since this will quickly become unsustainable.

zonca commented 2 years ago

Ok, @zonca - I deployed my local serviceX and did some snooping around. First of all, the helm chart doesn't support the demo DID finder. It won't deploy it into the cluster, and even if you manually deploy it, the App will refuse to route to it. You would have to add it to the list of valid DID finders.

yes, I do that with kubectl edit after deployment, see https://github.com/pondd-project/pondd/issues/15#issuecomment-982889834.

In fact if I do python tests/post.py 5000 tests/yt.json (https://github.com/pondd-project/ServiceX_DID_Finder_Demo/tree/main/tests), the Demo DID finder logs show a request coming through.

What is the Python version doing differently?

Consequently, I decided to run your example against the CERN open data DID finder which is supported. I changed the code to match our current libraries. (There is a distressing number of stale examples and incorrect documentation out there)

from func_adl_servicex import ServiceXSourceUpROOT

ds = ServiceXSourceUpROOT("cernopendata://1507", "nominal")

data = ds.Select("lambda e: {'JetPT': e['jet_pt']}").AsAwkwardArray().value()

data['JetPT']

This is sufficient to trigger the DID finder (and then the transformers will fail due to the files not being in the format this specific request is expecting).

I would like to have a working example. If this is not fully working either, I'd like to see if we can make the Demo example work.

My suggestion is to

  1. Deploy the CERN open data DID finder and get this example to trigger the DID finder first.
  2. Make a copy of the demo DID finder and call it SuperCDMSDataFinder and extend the helm chart to allow it to deploy this DID finder and to add it to the list of valid DID Finders
  3. Try this example again against your new DID Finder.

We are getting to the point that we need to rethink this aspect of the helm chart since this will quickly become unsustainable.

As a first step, you could make the list of valid DID finders a configuration option of the Helm chart; that should be easy to do.

gordonwatts commented 2 years ago

Ok, thanks. Now I understand a bit better what is going on!

As to the differences between the Jupyter notebook and python versions - I can only imagine they end up hitting different backends. The following line makes the difference here:

sx_dataset = ServiceXDataset("demo://dataset1", backend_type='dev_uproot')

According to the jupyter notebook page, it didn't find the dev_uproot in the servicex.yaml files, so it just assumed you wanted the development version. As long as port 5000 is available from wherever your jupyter server is running, then it worked. The demo is designed to return one file no matter what dataset it is given, and it looks like that worked.

The above errors, etc., are pretty confusing, so I'm not sure what exactly you are looking at - you mentioned two different things you've tried. First one is python tests/post.py 5000 tests/yt.json. Here you are generating a straight REST API request. First, just to make sure I understand, the 5000 tests/yt.json are not relevant - everything in that is hardcoded. In that case, it looks like things worked as you expected, and the demo got routed to the appropriate place.

In your jupyter notebook page it looks like ipywidgets isn't installed. Makes the progress bars look graphical, btw, when running.

Next was an error associated with python tests/simple_query.py. Here the problem was that nothing showed up in the DID finder, and there was a crash in the codegen. First, I don't see any output from the python command at the terminal. If you include the dev_uproot (e.g. uncomment it), then you should see the same warning message you saw in the jupyter notebook. Could you make sure to do that? The ResultTTree error is... very odd. ResultTTree comes into it only when you are aiming at the xAOD backend. But you are using a ServiceXSourceUpROOT - which should be fine. In order to better track this down, I think we need to see what is going on at the command line. Could you add the following two lines to the top of the simple_query call and then paste the results:

import logging
logging.basicConfig(level=logging.DEBUG)

This will allow us to see the actual func_adl statement that was sent down. It is possible the ResultTTree was added, though it would be odd if it was. In my experience, if the inputs are the same then the scripts behave identically in Jupyter notebooks and in python files (and that is certainly the design).
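
For readers following the codegen traceback above (visit_Name, then resolve_id, then NameError), here is a rough sketch of that mechanism using only Python's standard ast module. It is not the func_adl_uproot implementation: the class NameChecker, the function check_expression, the set of known ids, and the example expressions are hypothetical and only mimic how an unrecognized name such as ResultTTree would be rejected.

```python
import ast


class NameChecker(ast.NodeVisitor):
    """Walk an expression and reject any bare name that is not in a known set."""

    def __init__(self, known_ids):
        self.known_ids = set(known_ids)

    def visit_Name(self, node):
        if node.id not in self.known_ids:
            raise NameError("Unknown id: " + node.id)


def check_expression(source, known_ids):
    """Parse a single expression and check every name it references."""
    tree = ast.parse(source, mode="eval")
    NameChecker(known_ids).visit(tree)
    return True


# Hypothetical set of names a translator might understand.
known = {"Select", "EventDataset"}

print(check_expression("Select(EventDataset('mini'), 'jet_pt')", known))

try:
    check_expression("ResultTTree(Select(EventDataset('mini'), 'jet_pt'), 'out.root')", known)
except NameError as err:
    print(err)
```

If the query sent by the client is wrapped in a call the translator does not know, a check of this kind fails before any DID lookup happens, which would be consistent with the DID finder log showing nothing.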

You asked what the point of the codegen step was. Its job is to translate the selection string in the REST API call into something that is easily used by the transformer. Originally, we thought it might compile actual C++ code, but that turned out not to be efficient in the end. Once it is done doing the prep, it builds a configmap (I think that is the right name for it), and that map is made available to all the transformers. Thus they get access to the query via that route. For you, with the python transformers, you could just write a single file, and have the transformer pick that up and run with it.

Finally, there is an intermediate level. Since you aren't using func_adl in the end, you can just use the servicex package directly. This allows you to "paste" in the selection text directly. In short, it is somewhere between the raw REST API call you have and the full call you have here. You might try something like this:

    from servicex import ServiceXDataset
    # Replace the placeholder with the selection text from tests/post.py
    query = "<selection text from tests/post.py>"
    dataset = "demo://dataset1"
    ds = ServiceXDataset(dataset, backend_name="dev_uproot")
    r = ds.get_data_awkward(query)
    print(r)

I'll watch this thread for more updates.

zonca commented 2 years ago

thanks @gordonwatts!

Here is the output of simple_query.py with logging enabled: https://gist.github.com/fbbff2a73bbb3afac518e9abe95a87c3

zonca commented 2 years ago

The intermediate level test:

import logging
logging.basicConfig(level=logging.DEBUG)
from servicex import ServiceXDataset
query = "(Select (Where (call EventDataset 'mini') (lambda (list e) (or (attr e 'trigE') (attr e 'trigM')))) (lambda (list e) (dict (list 'lep_pt' 'lep_eta' 'lep_phi' 'lep_energy' 'lep_charge' 'lep_ptcone30' 'lep_etcone20' 'lep_type' 'lep_trackd0pvunbiased' 'lep_tracksigd0pvunbiased' 'lep_z0') (list (attr e 'lep_pt') (attr e 'lep_eta') (attr e 'lep_phi') (attr e 'lep_E') (attr e 'lep_charge') (attr e 'lep_ptcone30') (attr e 'lep_etcone20') (attr e 'lep_type') (attr e 'lep_trackd0pvunbiased') (attr e 'lep_tracksigd0pvunbiased') (attr e 'lep_z0')))))"
dataset = "demo://dataset1"
ds = ServiceXDataset(dataset, backend_name="dev_uproot")
r = ds.get_data_awkward(query)
print(r)

gives:

Traceback (most recent call last):
  File "tests/intermediate_level_test.py", line 6, in <module>
    ds = ServiceXDataset(dataset, backend_name="dev_uproot")
  File "/home/zonca/miniconda3/envs/pondd/lib/python3.8/site-packages/servicex/servicex.py", line 220, in __init__
    end_point, token = config.get_servicex_adaptor_config(backend_name)
  File "/home/zonca/miniconda3/envs/pondd/lib/python3.8/site-packages/servicex/servicex_config.py", line 173, in get_servicex_adaptor_config
    config = self._get_backend_info(backend_name)
  File "/home/zonca/miniconda3/envs/pondd/lib/python3.8/site-packages/servicex/servicex_config.py", line 121, in _get_backend_info
    raise ServiceXException(f'Unable to find name/type {backend_name} '
servicex.utils.ServiceXException: (ServiceXException(...), 'Unable to find name/type dev_uproot in api_endpoints in servicex.yaml configuration file. Saw only names (default, default) and types (xaod, cms_run1_aod)')

BenGalewsky commented 2 years ago

Check in your .servicex file - there needs to be an entry with name: dev_uproot
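
As a rough illustration of what that lookup needs, the sketch below searches a parsed api_endpoints list for an entry whose name or type matches the requested backend. It is not the servicex client's real code: find_backend is a hypothetical helper, matching on either name or type is an assumption based on the "name/type" wording of the error, and the placeholder endpoints and the pairing of names with types are made up. Only the names and types listed in the error message and the localhost:5000 address come from this thread.

```python
def find_backend(api_endpoints, backend_name):
    """Return the first endpoint whose name (or, failing that, type) equals backend_name."""
    for key in ("name", "type"):
        for endpoint in api_endpoints:
            if endpoint.get(key) == backend_name:
                return endpoint
    names = ", ".join(str(e.get("name")) for e in api_endpoints)
    types = ", ".join(str(e.get("type")) for e in api_endpoints)
    raise LookupError(
        f"Unable to find name/type {backend_name}. "
        f"Saw only names ({names}) and types ({types})"
    )


# Entries as reported in the error message (endpoints are placeholders).
endpoints = [
    {"name": "default", "endpoint": "http://placeholder.invalid", "type": "xaod"},
    {"name": "default", "endpoint": "http://placeholder.invalid", "type": "cms_run1_aod"},
]

try:
    find_backend(endpoints, "dev_uproot")
except LookupError as err:
    print(err)

# Adding an entry named dev_uproot makes the lookup succeed.
endpoints.append({"name": "dev_uproot", "endpoint": "http://localhost:5000", "type": "uproot"})
print(find_backend(endpoints, "dev_uproot")["endpoint"])
```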

BenGalewsky commented 2 years ago

We are very close to deploying the YT astrophysics ServiceX with a yt-Hub DID Finder and the new Python-function-based selection. Hopefully that will be easier for you to work with and will be directly on the path to our SuperCDMS deployment.

zonca commented 2 years ago

@gordonwatts here is the error when I run simple_query.py specifying dev_uproot:

Traceback (most recent call last):
  File "tests/simple_query.py", line 11, in <module>
    sx_dataset = ServiceXDataset("demo://dataset1", backend_name='dev_uproot')
  File "/home/zonca/miniconda3/envs/pondd/lib/python3.8/site-packages/servicex/servicex.py", line 220, in __init__
    end_point, token = config.get_servicex_adaptor_config(backend_name)
  File "/home/zonca/miniconda3/envs/pondd/lib/python3.8/site-packages/servicex/servicex_config.py", line 173, in get_servicex_adaptor_config
    config = self._get_backend_info(backend_name)
  File "/home/zonca/miniconda3/envs/pondd/lib/python3.8/site-packages/servicex/servicex_config.py", line 121, in _get_backend_info
    raise ServiceXException(f'Unable to find name/type {backend_name} '
servicex.utils.ServiceXException: (ServiceXException(...), 'Unable to find name/type dev_uproot in api_endpoints in servicex.yaml configuration file. Saw only names (default, default) and types (xaod, cms_run1_aod)')

zonca commented 2 years ago

Check in your .servicex file - there needs to be an entry with name: dev_uproot

where do I find that file?

BenGalewsky commented 2 years ago

See https://github.com/ssl-hep/ServiceX_frontend#configuration

Do you have auth enabled for your site? If so, you can download it from the ServiceX dashboard. Otherwise you just construct it with the url to your deployment following the guidance in the README.

Mine looks like:

api_endpoints:
  - name: local-uproot
    endpoint: http://localhost:5000
    type: uproot

zonca commented 2 years ago

ok, thanks @BenGalewsky I created that file: https://github.com/pondd-project/ServiceX_DID_Finder_Demo/commit/3132b328625a8c3ca6f23567105a970322f15b67 Now I get a different error:

https://gist.github.com/cbe1d3c59cbdd1edce585c231839f5d3

All my configuration is at: https://github.com/pondd-project/ServiceX_DID_Finder_Demo, mostly values_minimal.yaml and deploy.yaml.

zonca commented 2 years ago

My deployment with kind is detailed in https://github.com/pondd-project/ServiceX_DID_Finder_Demo#test-locally-with-kind and the Makefile

zonca commented 2 years ago

also I think kind is an easy way to setup a development environment and could also be used for CI on Github https://github.com/marketplace/actions/kind-kubernetes-in-docker-action

Michael-D-Johnson commented 2 years ago

Your error log shows that ServiceX can't find your Minio login information. I don't see accessKey or secretKey specified for your minio app. For example, in my deployment I have in my values.yaml file:

# Values for Minio Chart
minio:
  # For easy testing we don't require PVs for minio
  persistence:
    enabled: false
  accessKey: miniouser
  secretKey: leftfoot1
  resources:
    requests:
      memory: 1Gi

If that is contained in your TLS secret, you may need to update the env var that is passed to the ServiceX app (https://github.com/ssl-hep/ServiceX/blob/develop/servicex/templates/app/deployment.yaml).

        - name: MINIO_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: {{ .Values.secrets }}
              key: accesskey
        - name: MINIO_SECRET_KEY
          valueFrom:
            secretKeyRef:
              name: {{ .Values.secrets }}
              key: secretkey

I have a simple deployment running python codegenerator and the yt/girder DID finder and transformer on my laptop using minikube. I use an emptyDir() minio instance so I haven't had to deal with TLS issues with my own deployment of Minio. If interested you can find the setup here: https://github.com/Michael-D-Johnson/servicex-yt-deployment

zonca commented 2 years ago

Thanks @Michael-D-Johnson! I followed your suggestion, but for some reason I still get the same error:

servicex.utils.ServiceXException: (ServiceXException(...), 'Do not know or have enough information to create a Minio Login info ({\'request_id\': \'9c5f0f4c-2113-4a1c-acb3-d62f730bf353\', \'did\': \'demo://dataset1\', \'columns\': None, \'selection\': "(call Select (call EventDataset \'bogus.root\' \'mini\') (lambda (list e) (dict (list \'JetPT\') (list (subscript e \'jet_pt\')))))", \'tree-name\': None, \'image\': \'sslhep/servicex_func_adl_uproot_transformer:develop\', \'workers\': 20, \'result-destination\': \'object-store\', \'result-format\': \'root-file\', \'workflow-name\': \'selection_codegen\', \'generated-code-cm\': \'9c5f0f4c-2113-4a1c-acb3-d62f730bf353-generated-source\', \'status\': \'Submitted\', \'failure-info\': None, \'app-version\': \'1.0.0rc3\', \'code-gen-image\': \'sslhep/servicex_code_gen_func_adl_uproot:develop\'})')

full log: https://gist.github.com/bbe1ad764b375f88214b3fc9aa1f9dfa
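
One thing visible in the error above is that the request info returned by ServiceX contains no Minio fields at all. The sketch below shows a quick way to list which fields are absent. The key names in ASSUMED_MINIO_KEYS are an assumption about what the client looks for and are not confirmed anywhere in this thread. missing_minio_keys is a hypothetical helper, and the dictionary is an abbreviated copy of the one in the error message.

```python
# Assumed key names; the real client may look for different ones.
ASSUMED_MINIO_KEYS = (
    "minio-endpoint",
    "minio-secured",
    "minio-access-key",
    "minio-secret-key",
)


def missing_minio_keys(transaction_info, required=ASSUMED_MINIO_KEYS):
    """Return the assumed Minio keys that are absent from the request info."""
    return [key for key in required if key not in transaction_info]


# Abbreviated copy of the request info printed in the error message.
info = {
    "request_id": "9c5f0f4c-2113-4a1c-acb3-d62f730bf353",
    "did": "demo://dataset1",
    "result-destination": "object-store",
    "result-format": "root-file",
    "status": "Submitted",
}

print(missing_minio_keys(info))
```

If every assumed key is reported as missing, the app may not be receiving any Minio settings at all, which points at the deployment configuration rather than at the Python client.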

Michael-D-Johnson commented 2 years ago

Hmm. I wonder if there is an issue with the TLS. I'm curious if you get the same error if you deploy a minimal Minio instance:

minio:
  persistence:
    enabled: false
  accessKey: miniouser
  secretKey: leftfoot1

If we can get this bare bones version to work with your deployment then we can work up to deploying with your pondd-minio.zonca.dev Minio instance.