sammosummo closed this issue 4 years ago
Rats. I'll take a look this evening.
@sammosummo looks like a problem with that particular file, rather than the code dependencies. Are you able to share it?
That makes sense. Not at liberty to distribute them, but there are several here.
I tried the top one alphabetically, Acculturation / ACQ_J, and couldn't reproduce the issue. Could you point me to one that failed?
While testing, I fixed a different issue (#40), so reporting this was helpful already.
I've been playing with some of the files here: https://github.com/phuse-org/phuse-scripts/tree/master/data/send (./PointCross/lb.xpt has the problem described below)
I was trying to generate some larger-volume files for testing, using this library to multiply records from an existing file and write a new XPT. I'm also hitting the recursion-depth error on dumping. Here is my code:

```python
with open(inputFile,'rb') as inFile:
    # Assumed step: the post does not show how `library` is loaded.
    library = xport.v56.load(inFile)
for dataset in library.items():
    library=xport.Library({dataset[0]:dataset[1]})
    with open(outputFile,'wb') as outFile:
        xport.v56.dump(library,outFile)
```
(just a test, taking the datasets and outputting them back into another file)
Results in the error:
```
Traceback (most recent call last):
  File "gen-big-xpt.py", line 89, in
  .... the stack is huge....
    data = self._format_data()
  File "/usr/local/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 938, in _format_data
    self, self._formatter_func, is_justify=is_justify, name=name
  File "/usr/local/lib/python3.7/site-packages/pandas/io/formats/printing.py", line 318, in format_object_summary
    display_width, _ = get_console_size()
  File "/usr/local/lib/python3.7/site-packages/pandas/io/formats/console.py", line 16, in get_console_size
    display_width = get_option("display.width")
  File "/usr/local/lib/python3.7/site-packages/pandas/_config/config.py", line 231, in __call__
    return self.__func__(*args, **kwds)
  File "/usr/local/lib/python3.7/site-packages/pandas/_config/config.py", line 102, in _get_option
    key = _get_single_key(pat, silent)
RecursionError: maximum recursion depth exceeded
```
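For anyone else who needs to post a traceback like this one: a recursion error produces thousands of stack entries, and usually only the innermost few matter. Here is a minimal sketch of trimming it with the standard-library `traceback` module. The `runaway` function and the `short_traceback` helper are made up for illustration; they stand in for the failing `dump` call and are not part of xport.

```python
import traceback


def runaway(n=0):
    # Hypothetical stand-in for the failing call: recurses until Python gives up.
    return runaway(n + 1)


def short_traceback(func, frames=5):
    """Call func and return only the innermost `frames` stack entries
    of the RecursionError traceback, or None if nothing was raised."""
    try:
        func()
    except RecursionError:
        # A negative limit keeps the last abs(limit) stack entries.
        return traceback.format_exc(limit=-frames)
    return None


tail = short_traceback(runaway)
print(tail)
```

Swapping `runaway` for a small function that performs the real dump should give a traceback short enough to paste into an issue.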
Hitting the same (or similar) recursion-depth error with the following:

```python
import pandas as pd
import xport
import xport.v56

datasets={}
for i in range(1,10):
    values1=[]
    values2=[]
    for j in range(1,10000000):
        values1.append(j)
        values2.append('values'+str(i))
    df = pd.DataFrame({
        'alpha'+str(i): values1,
        'beta'+str(i): values2,
    })
    ds = xport.Dataset(df, name='DATA'+str(i), label='Wonderful data '+str(i))
    for k, v in ds.items():
        v.label = k               # Use the column name as SAS label
        v.name = k.upper()[:8]    # SAS names are limited to 8 chars
        if v.dtype == 'object':
            v.format = '$CHAR20.' # Variables will parse SAS formats
        else:
            v.format = '10.2'
    datasets['DATA'+str(i)] = ds
library = xport.Library(datasets)
with open('example.xpt', 'wb') as f:
    xport.v56.dump(library, f)
```
but this works fine:

```python
import pandas as pd
import xport
import xport.v56

datasets={}
for i in range(1,10):
    values1=[]
    values2=[]
    for j in range(1,10):
        values1.append(j)
        values2.append('values'+str(i))
    df = pd.DataFrame({
        'alpha'+str(i): values1,
        'beta'+str(i): values2,
    })
    ds = xport.Dataset(df, name='DATA'+str(i), label='Wonderful data '+str(i))
    for k, v in ds.items():
        v.label = k               # Use the column name as SAS label
        v.name = k.upper()[:8]    # SAS names are limited to 8 chars
        if v.dtype == 'object':
            v.format = '$CHAR20.' # Variables will parse SAS formats
        else:
            v.format = '10.2'
    datasets['DATA'+str(i)] = ds
library = xport.Library(datasets)
with open('example.xpt', 'wb') as f:
    xport.v56.dump(library, f)
```
After some quick testing, it seems to break at this cutoff: `for j in range(1,61):` works, but `for j in range(1,62):` fails.
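In case it helps anyone narrow down a similar size-dependent failure, here is a rough sketch of finding such a cutoff by bisection instead of trial and error. It assumes the failure is monotonic (once a size fails, every larger size fails too). The `fails` function below is a made-up stand-in; in practice it would build a dataset with `n` rows, try to dump it, and report whether a `RecursionError` was raised.

```python
def smallest_failing_size(fails, lo, hi):
    """Return the smallest n in (lo, hi] for which fails(n) is True.

    Assumes fails(lo) is False, fails(hi) is True, and that failure is
    monotonic in n.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid   # the cutoff is at mid or below
        else:
            lo = mid   # the cutoff is above mid
    return hi


def fails(n):
    # Hypothetical stand-in for "dumping n rows raises RecursionError".
    return n > 60


cutoff = smallest_failing_size(fails, 1, 10_000_000)
print(cutoff)
```

Each step halves the search range, so even a ten-million-row upper bound needs only about two dozen trial dumps.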
Ignore my last two comments, I just found:

```python
def copy_metadata(self, other):
    """
    Copy metadata from another Variable.
    """
```
Commenting out that line or pulling your latest fixes my problem.
@bunk1978 Did I mess up the PyPI upload? It looks like it's synchronized with GitHub master.
Looks like I did, in fact, accidentally leave that recursion bug in there. Fixed.
Thanks! Sorry I didn't respond faster.
Just a heads up for anyone who ends up here: once you trim the column names to 8 characters using something like the line below, make sure they are still unique. Otherwise you might hit `RecursionError: maximum recursion depth exceeded`.

```python
ds = ds.rename(columns={k: k.upper()[:8] for k in ds})
```
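A quick way to check for this before renaming is to group the original column names by their truncated form. This is only a sketch: `find_name_collisions` and the sample column names are made up for illustration and are not part of xport.

```python
from collections import defaultdict


def find_name_collisions(columns, limit=8):
    """Map each truncated, upper-cased name to the original columns that
    share it, keeping only names shared by more than one column."""
    groups = defaultdict(list)
    for col in columns:
        groups[col.upper()[:limit]].append(col)
    return {short: cols for short, cols in groups.items() if len(cols) > 1}


# Hypothetical column names; the first two collide once cut to 8 characters.
columns = ['patient_id', 'patient_identifier', 'visit']
collisions = find_name_collisions(columns)
if collisions:
    print('Rename these before truncating:', collisions)
```

An empty result means the truncated names are unique and the rename is safe.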
@meain Doh! That's fun. Do you mind opening a new issue and pasting a traceback in there?
@selik opened https://github.com/selik/xport/issues/61
Installed in a fresh conda environment:
Tried to convert one file to another via `xport file1.xpt > file2.csv`. Got an enormous error traceback, ending with: