microsoft / sql-server-language-extensions


Python extension truncates output #33

Closed maXXis253 closed 1 year ago

maXXis253 commented 1 year ago

The Python extension seems to truncate output strings to 65535 characters. This is reproducible with SQL Server 2019 and Python 3.7.x and 3.9.x. The code snippet below reproduces the issue, assuming the external language myPython has been created.

CREATE OR ALTER PROCEDURE dbo.P_TestPython
AS
BEGIN
    -- REPLICATE on a non-MAX input is capped at 8000 characters,
    -- so the starting length is 8000, not 10000.
    DECLARE @RequestStr VARCHAR(max)=replicate('A',10000),
            @ResponseStr VARCHAR(max),
            @i INT=1

    -- Double the string four times: 8000 * 16 = 128000 characters.
    WHILE @i<=4 BEGIN
        SELECT @RequestStr=@RequestStr+@RequestStr
        SET @i+=1
    END
    -- Prints 128000
    select len(@RequestStr)

    EXEC sp_execute_external_script
        @language=N'myPython',
        @script=N'
print(len(RequestStr))
ResponseStr = RequestStr
',
        @params=N'@RequestStr VARCHAR(MAX),@ResponseStr VARCHAR(MAX) OUTPUT',
        @RequestStr=@RequestStr,
        @ResponseStr=@ResponseStr OUTPUT

    -- Prints 65535
    select len(@ResponseStr)
END
go
EXEC dbo.P_TestPython
go
-- Output 128000,65535

Aniruddh25 commented 1 year ago

Hi @maXXis253, we acknowledge the issue. It seems to be similar to what RExtension is facing, and the fix looks simple and similar to the one in this PR: https://github.com/microsoft/sql-server-language-extensions/pull/13 Feel free to send a PR for PythonExtension as well.

That said, we plan to release new language extensions built on Python 3.10 and R 4.0.2 soon. SQL Server 2022 also has built-in support for bringing your own runtime. See: https://docs.microsoft.com/en-us/sql/machine-learning/install/sql-machine-learning-services-windows-install-sql-2022?view=sql-server-ver16#install-python

sandiptir commented 1 year ago

Hi Aniruddh, we are running into the same issue with Python and would really appreciate a revised connector from Microsoft for SQL Server 2019. We have not been able to apply the R fix from PR #13 to Python. Thanks, Sandip

Aniruddh25 commented 1 year ago

Hi @sandiptir, @maXXis253, the fix lies in changing the extracted data type from int to __int64, i.e. bp::extract<__int64>, here: https://github.com/microsoft/sql-server-language-extensions/blob/8323edabcb3de123477457741fec23a25531d640/language-extensions/python/src/PythonDataSet.cpp#L1516

Feel free to make the change and build the Python extension.

We will soon provide a new release as well.

sandiptir commented 1 year ago

Thank you Aniruddh,

We are looking forward to a formal release from Microsoft for SQL Server 2019. In the meantime, we will use your suggestion.

maXXis253 commented 1 year ago

Hi @Aniruddh25,

I have made the suggested code change, recompiled, and redeployed, but the output is still truncated.

Aniruddh25 commented 1 year ago

Hi @maXXis253, you are right. After debugging your use case further, I found that the output parameter is actually being trimmed explicitly here:

https://github.com/microsoft/sql-server-language-extensions/blob/276da9809503b6c578b740c487990e93454c8cdd/language-extensions/python/src/PythonParam.cpp#L330

A quick fix is to edit this line so that it truncates only when m_size is NOT the max value (i.e. 65535), then recompile and redeploy:

            if (m_value.size() > m_size && m_size < 65535)
            {
                m_value.resize(m_size);
                m_value.shrink_to_fit();
            }

We will be doing a formal release too.
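
For anyone who wants to check the condition above in isolation before rebuilding the extension, here is a minimal standalone sketch. It is not the extension's actual code: the function name TruncateToDeclaredSize, the constant kMaxSizeSentinel, and the use of std::string are assumptions made for illustration. The only details taken from this thread are the comparison itself and the statement that a size of 65535 stands for a MAX-typed parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption based on this thread: a declared size of 65535 marks a
// VARCHAR(MAX)-style parameter. The constant name is hypothetical.
constexpr std::size_t kMaxSizeSentinel = 65535;

// Hypothetical stand-in for the trimming step quoted above.
// Trims `value` to `declaredSize` characters only when the parameter has a
// bounded size, and leaves values of MAX-typed parameters untouched.
void TruncateToDeclaredSize(std::string &value, std::size_t declaredSize)
{
    if (value.size() > declaredSize && declaredSize < kMaxSizeSentinel)
    {
        value.resize(declaredSize);
        value.shrink_to_fit();
    }
}

// Small self-check mirroring the repro: a bounded parameter is still
// trimmed, while a 128000-character MAX value keeps its full length.
void CheckTruncation()
{
    std::string bounded(100, 'A');
    TruncateToDeclaredSize(bounded, 10);
    assert(bounded.size() == 10);

    std::string unbounded(128000, 'A');
    TruncateToDeclaredSize(unbounded, kMaxSizeSentinel);
    assert(unbounded.size() == 128000);
}
```

Without the extra `declaredSize < kMaxSizeSentinel` check, the second case would be cut to 65535 characters, which matches the behavior reported at the top of this issue.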

maXXis253 commented 1 year ago

Hi @Aniruddh25, this fixes the issue. Thanks!