Closed fhoering closed 5 months ago
We're looking into this and will get back to you asap
Hi,
If you run this locally, could you invoke it with flag --udf_timeout=120s
or some large duration (currently the timeout is only 5s). Using the quick start guide as an example this flag would need to be added here like other flags.
If you run this on AWS, could you change this parameter in parameter store
Sorry I meant --udf_update_timeout=120s
for local
OK. Thanks. @peiwenhu. I can try this yes. But does this actually mean that my UDF takes more than 5 seconds to reply ?
Because the error happens when the DELTA file is loaded, not when the request is executed.
Given the stack trace I was thinking it might come from the riegeli record reader for example: (https://github.com/google/riegeli/blob/914e276f53b9c3b18528cbf8908f1ff92310195e/riegeli/base/parallelism.cc#L61)
The UDF is loaded when it's DELTA file is loaded so that's expected.
There are a few possibilities. It's not easy for us to conclude yet with only the information available here. By increasing the timeout we will have a better understanding. On our end, loading 30mb doesn't time out. So it might be a difference between your machine and ours or some other reason. If after increasing timeout to a sufficiently large number you still run into this issue, that'd be a strong indicator that something else is fishy.
Ok. I will try this. I got confused at one point because this flag is only available in the main branch not in 0.16 release. But I will build a new binary.
@peiwenhu Ok. It seems better. I'm able to run this WASM file now.
Did the logs change in the latest version ? Before I was running like this to get all logs including JS output:
--logging_verbosity_level=9 --stderrthreshold=0
Since you are running locally, could you build with the nonprod_mode flag? The JS logs won't show in prod mode, but will in nonprod_mode.
Here are the updated instructions for building locally: https://github.com/privacysandbox/protected-auction-key-value-service/blob/main/docs/deployment/deploying_locally.md#build-the-local-binary
@lusayaa I tried to do compile a new version with this flag without success. Compiled main with:
./builders/tools/bazel-debian build //components/data_server/server:server \
--config=local_instance \
--config=local_platform \
--config=nonprod_mode
./server --delta_directory=/tmp/deltas --realtime_directory=/tmp/realtime --port=123 --route_v1_to_v2=true --logging_verbosity_level=9 --stderrthreshold=0 --udf_update_timeout=120s --udf_timeout=120s
Those are the logs I see but a lot are missing:
Warning: server token is not set, consent log is turned off.
I0610 11:22:10.599415 101 sandbox.cc:322] CALL ENTRY: 'RunCodeFromSerializedData' with 3 argument(s)
I0610 11:22:10.599750 101 sandbox.cc:277] Synchronization (TO), ptr 0x3516ffa05ec0, Type: 3 for var: LenVal
I0610 11:22:10.599779 101 var_abstract.cc:96] TransferToSandboxee for: Structure of size: 16, local: 0x3516ffa15580, remote: 0x5573dc2f7420, size: 16
I0610 11:22:10.599793 101 var_abstract.cc:96] TransferToSandboxee for: Array, elem size: 1 B., total size: 0 B., nelems: 0, local: 0, remote: 0x5573dc2e6330, size: 0
I0610 11:22:10.599797 101 sandbox.cc:369] CALL ARG: (0), Type: Pointer, Size: 8, Val: Ptr to obj:0x3516ffa154a0 (type:'LengthValue' val:'LenVal'), local:(nil), remote:0x5573dc2f7420, size:0
I0610 11:22:10.599801 101 sandbox.cc:369] CALL ARG: (1), Type: Integer, Size: 4, Val: 296
I0610 11:22:10.599903 101 sandbox.cc:369] CALL ARG: (2), Type: Pointer, Size: 8, Val: Ptr to obj:0x7f56814f5708 (type:'Integer' val:'0'), local:0x7f56814f5748, remote:0x5573dc3a6850, size:8
I0610 11:22:12.928461 101 sandbox.cc:296] Synchronization (FROM), ptr 0x3516ffa05ec0, Type: 3 for var: LenVal
I0610 11:22:12.928508 101 var_abstract.cc:133] TransferFromSandboxee for: Structure of size: 16, local: 0x3516ffa15580, remote: 0x5573dc2f7420, size: 16
I0610 11:22:12.928524 101 var_abstract.cc:133] TransferFromSandboxee for: Array, elem size: 1 B., total size: 0 B., nelems: 0, local: 0x3516ffa02380, remote: 0x5573dc2e6330, size: 0
I0610 11:22:12.928527 101 sandbox.cc:296] Synchronization (FROM), ptr 0x3516ffa20e40, Type: 2 for var: 0
I0610 11:22:12.928531 101 var_abstract.cc:133] TransferFromSandboxee for: 0, local: 0x7f56814f5748, remote: 0x5573dc3a6850, size: 8
I0610 11:22:12.928533 101 sandbox.cc:398] CALL EXIT: Type: Integer, Size: 4, Val: 0
It worked before. So something must be broken. Can you have a look please ?
Hi @fhoering , are you able to see other server log lines such as data loading? Specifically are you able to see the log line about setting UDF object, loading UDF delta file is done and server starts watching new files, something like the following lines:
I0610 10:03:11.123586 2428735 data_orchestrator.cc:244] Loading /tmp/deltas/<your UDF delta file>
...
I0610 10:03:11.124105 2428825 data_orchestrator.cc:214] Setting UDF code snippet for version: 2, handler: HandleRequest,
...
I0610 10:03:11.134653 2428735 data_orchestrator.cc:360] Done loading /tmp/deltas/<your UDF delta file>
...
I0610 10:03:16.135478 2428826 change_notifier_local.cc:57] Watching for new files in directory: /tmp/deltas
I0610 10:03:21.135626 2428826 change_notifier_local.cc:67] No new files found within timeout, stopping.
I0610 10:03:21.135661 2428826 delta_file_notifier.cc:134] Backup poll
I0610 10:03:21.135743 2428826 delta_file_notifier.cc:233] No new file found
I0610 10:03:21.135761 2428826 change_notifier_local.cc:57] Watching for new files in directory: /tmp/deltas
No before on branch 0.16 I saw those log lines indeed. But with the instructions above on main branch I don't see them anymore.
Did you try to recompile that exact same way I did ?
@fhoering Yes, I compiled with exact command you used and I am able to see log lines I mentioned before.
To rule out the possibility of machine difference causing the issue. Could you please try to run the server locally in docker?
Commands to compile and run:
builders/tools/bazel-debian run //production/packaging/local/data_server:copy_to_dist --config=local_instance --config=local_platform --config=nonprod_mode
docker load -i dist/server_docker_image.tar
docker run -it --network=host -entrypoint=/server --init --rm --volume=/tmp/deltas:/tmp/deltas --volume=/tmp/realtime:/tmp/realtime --security-opt=seccomp=unconfined bazel/production/packaging/local/data_server:server_docker_image --port 50051 -route_v1_to_v2=true -logging_verbosity_level=9 -stderrthreshold=0 -udf_update_timeout=120s -udf_timeout=120s -delta_directory=/tmp/deltas -realtime_directory=/tmp/realtime
@emmafu2022 With the latest instructions it seems fine. But even the artifact published in bazel-bin/components/data_server/server/server
works now with a blank ubuntu:20.04
docker image. Not sure why exactly. So I will close for now and reopen if necessary.
Thanks
When loading a large WASM file (~30 MB) we are getting this error:
It prevents using this file for UDFs. Is it possible to increase this timeout somehow ?