Closed bolrDK closed 2 years ago
I think this might be fixed by #64, which is merged into master. Try using the latest version from GitHub.
Thanks - that helped. Now I only wonder why the xport.v56.dump function gives all the conversion warnings like:
warnings.warn(f'Converting column dtypes {conversions}')
Converting column 'STUDYID' from string to string
even though I have changed the object type to string for all relevant columns in the dataframe.
Not a clue. If you figure it out, let me know!
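Until the cause is known, one option is to filter out just this warning around the dump call. The following is a minimal sketch using only the standard `warnings` module. `noisy_dump` is a hypothetical stand-in for `xport.v56.dump` that emits a warning with the same text as the one quoted above; it is not part of xport. The sketch assumes the library raises the message as a `UserWarning` whose text starts with "Converting column dtypes", as the console output suggests.

```python
import warnings


def noisy_dump():
    """Hypothetical stand-in for xport.v56.dump (not part of xport)."""
    # Emits a UserWarning with the same leading text as the real message.
    warnings.warn("Converting column dtypes {'STUDYID': 'string'}")
    return b"payload"


# record=True collects any warning that is not filtered out into `caught`.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The filter added last is checked first, so this "ignore" rule wins
    # for messages that start with the given text.
    warnings.filterwarnings(
        "ignore",
        message="Converting column dtypes",
        category=UserWarning,
    )
    result = noisy_dump()
```

The filter only applies inside the `with` block, so other warnings elsewhere in the program stay visible. This hides the message but does not explain why a string column is reported as converted.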
Python version: 3.8.6 xport version: 3.2.1
When I try to export more than 60 rows to xport, I get the error message 'NotImplementedError: Can't copy SAS variable metadata to dataframe'. It happens regardless of how many variables I include, and regardless of whether I use the xport.v56.dump or xport.from_columns function.
I have since installed xport v2.0.2, where I can use the xport.from_columns function with any number of input rows without problems.
I have created a small test script to illustrate the problem - see the console output from executions with 60 and 61 input rows respectively below:
df1 = pd.DataFrame({'COL': ['01','02','03','04','05','06','07','08','09','10',
                            '11','12','13','14','15','16','17','18','19','20',
                            '21','22','23','24','25','26','27','28','29','30',
                            '31','32','33','34','35','36','37','38','39','40',
                            '41','42','43','44','45','46','47','48','49','50',
                            '51','52','53','54','55','56','57','58','59','60']})
ds1 = xport.Dataset(df1, name='test1')
with open('c:/temp/test1.xpt', 'wb') as f:
    xport.v56.dump(xport.Library({'test1': ds1}), f)

c:\users\bolr\programs\python38\lib\site-packages\xport\v56.py:630: UserWarning: Converting column dtypes {'COL': 'string'}
  warnings.warn(f'Converting column dtypes {conversions}')
Converting column 'COL' from object to string

df2 = pd.DataFrame({'COL': ['01','02','03','04','05','06','07','08','09','10',
                            '11','12','13','14','15','16','17','18','19','20',
                            '21','22','23','24','25','26','27','28','29','30',
                            '31','32','33','34','35','36','37','38','39','40',
                            '41','42','43','44','45','46','47','48','49','50',
                            '51','52','53','54','55','56','57','58','59','60',
                            '61']})
ds2 = xport.Dataset(df2, name='test2')
with open('c:/temp/test2.xpt', 'wb') as f:
    xport.v56.dump(xport.Library({'test2': ds2}), f)

Traceback (most recent call last):
  File "", line 10, in
    xport.v56.dump(xport.Library({'test2': ds2}),f)
  File "c:\users\bolr\programs\python38\lib\site-packages\xport\v56.py", line 932, in dump
    fp.write(dumps(library))
  File "c:\users\bolr\programs\python38\lib\site-packages\xport\v56.py", line 951, in dumps
    return bytes(Library(library))
  File "c:\users\bolr\programs\python38\lib\site-packages\xport\v56.py", line 727, in __bytes__
    b'members': b''.join(bytes(Member(member)) for member in self.values()),
  File "c:\users\bolr\programs\python38\lib\site-packages\xport\v56.py", line 727, in
    b'members': b''.join(bytes(Member(member)) for member in self.values()),
  File "c:\users\bolr\programs\python38\lib\site-packages\xport\__init__.py", line 470, in __init__
    self.copy_metadata(data)
  File "c:\users\bolr\programs\python38\lib\site-packages\xport\__init__.py", line 412, in copy_metadata
    for k, v in self.items():
  File "c:\users\bolr\programs\python38\lib\site-packages\pandas\core\frame.py", line 957, in items
    yield k, self._get_item_cache(k)
  File "c:\users\bolr\programs\python38\lib\site-packages\pandas\core\generic.py", line 3539, in _get_item_cache
    res = self._box_col_values(values, loc)
  File "c:\users\bolr\programs\python38\lib\site-packages\pandas\core\frame.py", line 3187, in _box_col_values
    return klass(values, index=self.index, name=name, fastpath=True)
  File "c:\users\bolr\programs\python38\lib\site-packages\xport\__init__.py", line 310, in __init__
    LOG.debug(f'Initialized {self}')
  File "c:\users\bolr\programs\python38\lib\site-packages\xport\__init__.py", line 276, in __repr__
    return f'{type(self).__name__}\n{super().__repr__()}\n{", ".join(metadata)}'
  File "c:\users\bolr\programs\python38\lib\site-packages\pandas\core\series.py", line 1315, in __repr__
    self.to_string(
  File "c:\users\bolr\programs\python38\lib\site-packages\pandas\core\series.py", line 1374, in to_string
    formatter = fmt.SeriesFormatter(
  File "c:\users\bolr\programs\python38\lib\site-packages\pandas\io\formats\format.py", line 261, in __init__
    self._chk_truncate()
  File "c:\users\bolr\programs\python38\lib\site-packages\pandas\io\formats\format.py", line 285, in _chk_truncate
    series = concat((series.iloc[:row_num], series.iloc[-row_num:]))
  File "c:\users\bolr\programs\python38\lib\site-packages\pandas\core\reshape\concat.py", line 274, in concat
    op = _Concatenator(
  File "c:\users\bolr\programs\python38\lib\site-packages\pandas\core\reshape\concat.py", line 395, in __init__
    axis = sample._constructor_expanddim._get_axis_number(axis)
  File "c:\users\bolr\programs\python38\lib\site-packages\xport\__init__.py", line 340, in _constructor_expanddim
    raise NotImplementedError("Can't copy SAS variable metadata to dataframe")
NotImplementedError: Can't copy SAS variable metadata to dataframe
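
A possible explanation for the 60-row threshold: the traceback passes through pandas' `_chk_truncate`, which builds the shortened repr of a long Series, and the pandas default for `display.max_rows` is 60. The sketch below uses plain pandas only, with no xport involved, to show that a 61-row Series is the first one whose repr is truncated. It is an illustration of the suspected trigger, not a confirmed diagnosis of the xport bug.

```python
import pandas as pd

# Default row limit above which pandas shortens a Series/DataFrame repr.
max_rows = pd.get_option("display.max_rows")

# Same data shape as the test script: zero-padded strings '01'..'60' / '61'.
s60 = pd.Series([f"{i:02d}" for i in range(1, 61)], name="COL")
s61 = pd.Series([f"{i:02d}" for i in range(1, 62)], name="COL")

# A truncated Series repr ends with a footer containing "Length: N".
truncated_60 = "Length:" in repr(s60)
truncated_61 = "Length:" in repr(s61)

# Lifting the limit temporarily makes pandas print every row again.
with pd.option_context("display.max_rows", None):
    truncated_61_unlimited = "Length:" in repr(s61)

print(max_rows, truncated_60, truncated_61, truncated_61_unlimited)
```

If this is indeed the trigger, wrapping the `xport.v56.dump` call in `pd.option_context("display.max_rows", None)` might avoid the failing code path in xport 3.2.1. That is an untested assumption.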