sassoftware / saspy

A Python interface module to the SAS System. It works with Linux, Windows, and Mainframe SAS as well as with SAS in Viya.
https://sassoftware.github.io/saspy

Writing large datasets (> 1 mil records) from python dataframe to sas dataset using df2sd results in random data loss on SAS M7 server with multibyte encoding (UTF8) #342

Closed snehak0991 closed 3 years ago

snehak0991 commented 3 years ago

Hi, I am trying to write a Python dataframe to a SAS dataset on a remote SAS M7 server with multibyte encoding (UTF8), using the IOM access method (config name iomwin). The dataframe is fairly large (over 1 million records), and writing it to the SAS dataset results in random data loss: the number of records written varies each time the Python script is run, and the SAS dataset never contains the complete row count in any of the runs.

This issue is only observed when writing large dataframes. My other script, which writes a smaller dataframe (~35k records), works fine and writes all records to the SAS dataset.

Please note: the df2sd method converts the dataframe to a SAS dataset in the SAS WORK library, and this dataset is then downloaded to a local disk location.

SAS session:
Access Method = IOM
SAS Config name = iomwin
SAS Config file = /home/username/.virtualenvs/vbase3/lib/python3.8/site-packages/saspy/sascfg_personal.py
WORK Path = E:\SASWORK\svcSasTask_TD19848PAPP84SAS03\Prc2\
SAS Version = 9.04.01M7P08052020
SASPy Version = 3.3.7
Teach me SAS = False
Batch = False
Results = Pandas
SAS Session Encoding = utf-8
Python Encoding value = UTF8
SAS process Pid value = 19848

Expected behavior: All records in the Python dataframe are written to the SAS dataset.
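One way to catch this kind of silent loss is to compare the dataframe's row count with the observation count of the dataset that df2sd returns. The following is only a minimal sketch: the helper name `write_and_verify` and the table name are hypothetical, and it assumes the returned SASdata object exposes an `obs()` method, as recent SASPy releases do.

```python
def write_and_verify(sas, df, table, libref="work"):
    """Write df with df2sd and fail loudly if any rows went missing.

    `sas` is expected to be a saspy.SASsession (or anything with a
    compatible df2sd method); this helper is illustrative, not part of SASPy.
    """
    expected = len(df)  # number of rows in the dataframe
    sas_data = sas.df2sd(df, table=table, libref=libref)
    actual = sas_data.obs()  # assumption: SASdata.obs() returns the row count
    if actual != expected:
        raise RuntimeError(
            f"{libref}.{table}: expected {expected} rows, SAS reports {actual}"
        )
    return sas_data


# Hypothetical usage against a real session (needs SASPy and a SAS server):
#   import saspy
#   sas = saspy.SASsession(cfgname="iomwin")
#   sd = write_and_verify(sas, df, table="big_table")
```

Running the check right after the write, before the dataset is downloaded, makes it clear whether rows were lost during the transfer into SAS or in a later step.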

snehak0991 commented 3 years ago

My SASPy version needed upgrading; installing the latest release solved the issue.
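For anyone hitting the same symptom, a quick check of the installed SASPy version against the 3.3.7 reported above can be sketched as follows. The helper names are hypothetical, and the thread does not say which release contains the fix, so this only tells you whether your install is newer than the affected one.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

REPORTED_VERSION = "3.3.7"  # version shown in the session info of this issue


def version_tuple(text):
    """Turn '3.3.7' into (3, 3, 7), keeping only the leading digits of each part."""
    parts = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def installed_saspy_version():
    """Return the installed SASPy version string, or None if it is not installed."""
    try:
        return version("saspy")
    except PackageNotFoundError:
        return None


def is_newer_than_reported(installed, reported=REPORTED_VERSION):
    """True if `installed` is strictly newer than the version from the report."""
    return version_tuple(installed) > version_tuple(reported)


found = installed_saspy_version()
if found is None:
    print("saspy is not installed in this environment")
elif is_newer_than_reported(found):
    print(f"saspy {found} is newer than {REPORTED_VERSION}")
else:
    print(f"saspy {found} is not newer than {REPORTED_VERSION}; consider upgrading")
```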

tomweber-sas commented 3 years ago

Hey, I'm glad you tried the latest release; and that it already had this fixed! I've made a number of enhancements and fixes to df2sd in the past few releases. Let me know if you need anything else! Thanks, Tom