Azure / azure-functions-docker

This repo contains the base Docker images for working with azure functions
MIT License

Signal only works in main thread of the main interpreter #906

Open JanDubcak opened 1 year ago

JanDubcak commented 1 year ago

Issue description:

When I trigger Robot Framework tests with pabot through an HTTP trigger, I get the following error (the happy path and the steps to reproduce are further down). Is there an issue with signal handling inside threads in Azure Functions? Many thanks for any response.

Function.rfBrowser[3]
      Executed 'Functions.rfBrowser' (Failed, Id=4967c840-f90b-450b-9859-0fd7befcf1a9, Duration=11ms)
      Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.rfBrowser
       ---> Microsoft.Azure.WebJobs.Script.Workers.Rpc.RpcException: Result: Failure
      Exception: ValueError: signal only works in main thread of the main interpreter
      Stack:   File "/azure-functions-host/workers/python/3.10/LINUX/X64/azure_functions_worker/dispatcher.py", line 479, in _handle__invocation_request
          call_result = await self._loop.run_in_executor(
        File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
          result = self.fn(*self.args, **self.kwargs)
        File "/azure-functions-host/workers/python/3.10/LINUX/X64/azure_functions_worker/dispatcher.py", line 752, in _run_sync_func
          return ExtensionManager.get_sync_invocation_wrapper(context,
        File "/azure-functions-host/workers/python/3.10/LINUX/X64/azure_functions_worker/extension.py", line 215, in _raw_invocation_wrapper
          result = function(**args)
        File "/home/site/wwwroot/rfBrowser/__init__.py", line 6, in main
          robot_rc = main_program(["./rfBrowser/test.robot"])
        File "/usr/local/lib/python3.10/site-packages/pabot/pabot.py", line 1919, in main_program
          _parallel_execute(
        File "/usr/local/lib/python3.10/site-packages/pabot/pabot.py", line 1319, in _parallel_execute
          original_signal_handler = signal.signal(signal.SIGINT, keyboard_interrupt)
        File "/usr/local/lib/python3.10/signal.py", line 56, in signal
          handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))

         at Microsoft.Azure.WebJobs.Script.Description.WorkerFunctionInvoker.InvokeCore(Object[] parameters, FunctionInvocationContext context) in /src/azure-functions-host/src/WebJobs.Script/Description/Workers/WorkerFunctionInvoker.cs:line 101
         at Microsoft.Azure.WebJobs.Script.Description.FunctionInvokerBase.Invoke(Object[] parameters) in /src/azure-functions-host/src/WebJobs.Script/Description/FunctionInvokerBase.cs:line 82       
         at Microsoft.Azure.WebJobs.Script.Description.FunctionGenerator.Coerce[T](Task`1 src) in /src/azure-functions-host/src/WebJobs.Script/Description/FunctionGenerator.cs:line 225
         at Microsoft.Azure.WebJobs.Host.Executors.FunctionInvoker`2.InvokeAsync(Object instance, Object[] arguments) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionInvoker.cs:line 52
         at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.InvokeWithTimeoutAsync(IFunctionInvoker invoker, ParameterHelper parameterHelper, CancellationTokenSource timeoutTokenSource, CancellationTokenSource functionCancellationTokenSource, Boolean throwOnTimeout, TimeSpan timerInterval, IFunctionInstance instance) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 581
         at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithWatchersAsync(IFunctionInstanceEx instance, ParameterHelper parameterHelper, ILogger logger, CancellationTokenSource functionCancellationTokenSource) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 527
         at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance, FunctionStartedMessage message, FunctionInstanceLogEntry instanceLogEntry, ParameterHelper parameterHelper, ILogger logger, CancellationToken cancellationToken) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 306
         --- End of inner exception stack trace ---
         at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance, FunctionStartedMessage message, FunctionInstanceLogEntry instanceLogEntry, ParameterHelper parameterHelper, ILogger logger, CancellationToken cancellationToken) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 352
         at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.TryExecuteAsync(IFunctionInstance functionInstance, CancellationToken cancellationToken) in D:\a\_work\1\s\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:line 108
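
The trace above shows the worker calling the function through run_in_executor and concurrent/futures/thread.py, that is, on a pool thread rather than the main thread. CPython only allows signal.signal() on the main thread of the main interpreter, which would explain the ValueError. A minimal sketch that reproduces the same error outside Azure Functions (the helper name try_install_sigint_handler is made up for illustration):

```python
import signal
from concurrent.futures import ThreadPoolExecutor


def try_install_sigint_handler():
    """Attempt to register a SIGINT handler and report the outcome as a string."""
    try:
        previous = signal.signal(signal.SIGINT, signal.default_int_handler)
    except ValueError as exc:
        # CPython raises this whenever signal.signal() is called off the main thread.
        return f"ValueError: {exc}"
    if previous is not None:
        signal.signal(signal.SIGINT, previous)  # put the original handler back
    return "installed"


# Make the call from a pool thread, which is how the trace shows main() being run.
with ThreadPoolExecutor(max_workers=1) as pool:
    worker_outcome = pool.submit(try_install_sigint_handler).result()

print("worker thread:", worker_outcome)
```

Calling the same helper directly from the main thread of a script or REPL should succeed, which is consistent with the docker exec session succeeding.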

Happy path

However, if I run it directly from Python inside the container via docker exec, it runs flawlessly.

docker exec -it robot /bin/bash
root@e74249e3dca0:/# cd /home/site/wwwroot/
root@e74249e3dca0:~/site/wwwroot# python3
Python 3.10.12 (main, Jun  7 2023, 19:37:06) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pabot.pabot import main_program
>>> main_program(["./rfBrowser/test.robot"])
2023-06-12 14:54:58.620289 [PID:228] [0] [ID:0] EXECUTING Test
2023-06-12 14:54:58.921917 [PID:228] [0] [ID:0] PASSED Test in 0.3 seconds
1 tests, 1 passed, 0 failed, 0 skipped.
===================================================
Output:  /home/site/wwwroot/output.xml
Log:     /home/site/wwwroot/log.html
Report:  /home/site/wwwroot/report.html
Total testing: 0.30 seconds
Elapsed time:  0.42 seconds
0

Env:

Docker container: 4-python3.10

requirements.txt:

azure-functions
robotframework==6.0.2
robotframework-pabot==2.16.0

__init__.py in function rfBrowser:

import azure.functions as func
from pabot.pabot import main_program

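def main(req: func.HttpRequest) -> func.HttpResponse:
    if req.method == "GET":
        robot_rc = main_program(["./rfBrowser/test.robot"])  # raises the ValueError when run on a worker thread
        return func.HttpResponse(f"\n\n Robot output: {robot_rc} \n")
    # Any other HTTP method previously fell through and returned None.
    return func.HttpResponse("Only GET is supported", status_code=405)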

Content of test rfBrowser/test.robot:

*** Test Cases ***
Test run with pabot
    Log to console    Done

vijaykumar911 commented 1 year ago

Hi @JanDubcak, I will update you soon.

vijaykumar911 commented 1 year ago

Move the code that uses the signal module to the main thread. If possible, refactor your code so that the signal handling is performed within the main thread of the Azure Function. This may involve restructuring your code or using different libraries that do not rely on signals.

Use a different approach for triggering the robot framework tests from within the Azure Function. Instead of using pabot with HTTP triggers, you can explore alternative methods such as invoking the tests directly without relying on pabot. This could involve calling the Robot Framework API directly or using other methods for test execution.

Check for any available updates or patches for the libraries you are using that may address this issue. Updating to the latest versions of the libraries may include bug fixes or improvements related to thread handling and signal usage.
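
One way to give signal-using code a main thread without touching the library is to run it in a child Python process, since every new process starts with its own main thread. The sketch below is an assumption-laden illustration, not a verified fix for Azure Functions: run_in_child_interpreter is a hypothetical helper, SIGNAL_CHECK is a stand-in for pabot, and PABOT_ENTRY only reuses the main_program call from the issue and is not executed here.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_in_child_interpreter(code, args=(), timeout=600):
    """Run `code` in a fresh Python process; return (exit_code, combined_output)."""
    completed = subprocess.run(
        [sys.executable, "-c", code, *args],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return completed.returncode, completed.stdout + completed.stderr


# Stand-in for pabot: a child that installs a SIGINT handler, as pabot does.
SIGNAL_CHECK = (
    "import signal; "
    "signal.signal(signal.SIGINT, signal.default_int_handler); "
    "print('handler installed')"
)

# Hypothetical entry snippet for the real case (assumes pabot is installed);
# defined for illustration only and not run in this sketch.
PABOT_ENTRY = (
    "import sys; from pabot.pabot import main_program; "
    "sys.exit(main_program(sys.argv[1:]))"
)

# Launch the child from a pool thread to mimic how the worker invokes main().
with ThreadPoolExecutor(max_workers=1) as pool:
    exit_code, output = pool.submit(run_in_child_interpreter, SIGNAL_CHECK).result()

print(exit_code, output.strip())
```

Inside the function, a call such as run_in_child_interpreter(PABOT_ENTRY, ["./rfBrowser/test.robot"]) might then replace the direct main_program call, with the exit code standing in for robot_rc. Whether the Functions host permits this and how it affects the timeout is untested here.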

pombredanne commented 1 month ago

FWIW, this problem now also shows up in contexts completely unrelated to Azure Functions, such as Azure DevOps CI/CD pipelines that merely run a general-purpose test suite.

I cannot fathom how or why I should refactor an application's core code to work around a test runner's issue.

See for instance: https://dev.azure.com/nexB/scancode-toolkit/_build/results?buildId=14699&view=logs&jobId=0ac92377-312b-57d4-67e8-5a0352399512&j=0ac92377-312b-57d4-67e8-5a0352399512&t=a25cc95d-7882-5ce7-6a26-d99affeaf5ac

...
E               Traceback (most recent call last):
E                 File "/home/vsts/work/1/s/src/scancode/interrupt.py", line 89, in interruptible
E                   create_signal(SIGALRM, handler)
E                 File "/opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/signal.py", line 56, in signal
E                   handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
E               ValueError: signal only works in main thread of the main interpreter

@vijaykumar911 where can I file this problem?
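
For code that may be called from either kind of thread, a defensive option is to install the handler only when running on the main thread and skip it otherwise. This is a generic sketch with made-up names (sigint_handler_if_possible, demo_in_worker); it is not how pabot or scancode actually handle this, and skipping the handler means losing whatever interrupt or timeout behaviour it provided.

```python
import signal
import threading
from contextlib import contextmanager


@contextmanager
def sigint_handler_if_possible(handler):
    """Install `handler` for SIGINT only on the main thread; yield True if installed."""
    if threading.current_thread() is not threading.main_thread():
        # signal.signal() would raise ValueError here, so do nothing instead.
        yield False
        return
    previous = signal.signal(signal.SIGINT, handler)
    try:
        yield True
    finally:
        # Restore the earlier handler (fall back to the default if it was unknown).
        signal.signal(signal.SIGINT, previous if previous is not None else signal.SIG_DFL)


outcomes = []


def demo_in_worker():
    # Runs on a non-main thread, so the handler is skipped rather than raising.
    with sigint_handler_if_possible(signal.default_int_handler) as installed:
        outcomes.append(installed)


worker = threading.Thread(target=demo_in_worker)
worker.start()
worker.join()
print(outcomes)
```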

JanDubcak commented 1 month ago

Well, I just bailed on using Azure Portal for tests and switched to local hardware. I wanted to use Azure so that I would not have to maintain hardware, but if maintaining hardware and communicating with other teams is simpler than forking an open source project to accommodate an (imho) flaw in the signal handling of a deployed containerized function, then I will do so.