abhishek-ram / django-pyas2

AS2 file transfer Server built on Python and Django.
https://django-pyas2.readthedocs.io
GNU General Public License v3.0
78 stars 31 forks source link

MSSQL 'payload' FileField max_length restrictions #107

Open jens-Github opened 4 months ago

jens-Github commented 4 months ago

We are developing with an MSSQL backed database where there are limitations on how some datatypes are stored for example a VARCHAR can only go up to a maximum of 4000. See below error for more information (related to commit debfd6a):

pyodbc.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The size (4096) given to the parameter 'payload' exceeds the maximum allowed (4000). (2717) (SQLExecDirectW)")

chadgates commented 4 months ago

Are you using https://github.com/microsoft/mssql-django? SQL Server would allow a longer nvarchar field, if it was typed as NVARCHAR(MAX), but I guess this should be handled by the mssql-django project to make this correctly. Maybe this one is related: https://github.com/microsoft/mssql-django/issues/317 and may help you resolve this.

jens-Github commented 4 months ago

Thank you for your response. I am indeed using mssql-django (and pyodbc)

However, I am not really sure if the issue is a part of the mssql-django project as per following information: (In short FileField is automaticaly mapped to an VARCHAR(n) or NVARCHAR(n) by Django if I understand correctly, and not really a fan of monkey patching tbh)

The 4000 character limit is not due to Python, Django, or any Python library, but it's a limitation imposed by Microsoft SQL Server itself for 'NVARCHAR' and 'VARCHAR' data types.

When you declare a VARCHAR(n) or NVARCHAR(n) type in SQL Server, n can be a value from 1 through 4,000. If a larger size is needed, VARCHAR(MAX) or NVARCHAR(MAX) must be used, which can store up to 2^31-1 characters (~2GB).

Thus, no matter what language or library you're using to interact with SQL Server, you're bound to encounter this limit if you're using VARCHAR(n) or NVARCHAR(n) fields.

If you're using a library to map Python objects to SQL Server database tables (like Django ORM does), then it depends on whether that library supports NVARCHAR(MAX) and VARCHAR(MAX), and how it implements that support. In the case of Django, the CharField and FileField types are implemented as NVARCHAR(n) on SQL Server, which leads to the current issue.

The Django FileField uses Django's CharField under the hood to store the path to the file. However, the SQL field type created in the database can vary depending on the database system. Here's a quick breakdown:

While creating the model in Django, CharField or FileField require max_length to be specified. This max_length attribute is then used to define the field length in the underlying SQL field type - like VARCHAR(n) or NVARCHAR(n). Regardless of the database system used, it's recommended to ensure the max_length attribute for FileField or its equivalent is set to an appropriate value considering the file path lengths in your application.

chadgates commented 4 months ago

To my knowledge, a linux file system has a maximum path size of 4096 characters. Therefore, the value in the model represents what the maximum file path length could be.

The logic of how a django field type is applied to the database is with the implementation of django-mssql, not with django, nor with django-pyas2, which is in agreement to what you stated: "If you're using a library to map Python objects to SQL Server database tables (like Django ORM does), then it depends on whether that library supports NVARCHAR(MAX) and VARCHAR(MAX), and how it implements that support."

The "library" in this case is django-mssql and therefore the issue should be tackled there. No?

jens-Github commented 4 months ago

Dear chadgates and Team,

I wanted to take a moment to express my sincere gratitude for your assistance in helping us navigate the issue we've been experiencing with django-pyas2. Your insight and guidance have been incredibly beneficial.

In our quest to find a solution, we've decided upon your recommendation and directed this matter to the mssql-django repository. We have raised this issue, describing the encountered behavior. As we continue to work towards resolving this, your advice has been instrumental in guiding our action steps.

We're hopeful that taking this to the mssql-django team will help us get closer to a resolution, or at least confirm if this is a known limitation with the Django MS SQL Server database backend.

Again, thank you so much for all of your support. We genuinely appreciate the time and expertise you've dedicated to assist us.

PS: many thanks as well for the django-pyas2 package itself; it is a really nice package!

Best regards, Jens