Per https://github.com/ursacomputing/arrowbench/pull/87#discussion_r865188670, harmonize the full and sample Fannie Mae sources. Currently the two differ in shape: the sample has 108 columns, of which 61 contain non-null data, while the full dataset has 31 columns, of which 29 are non-null. The sample therefore carries a large number of all-null columns. Neither source has column names, so operations that rely on names generated from column positions fail.
After this story, the only difference between the two sources should be the number of rows. If we can find some real column names, that would be ideal.
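
One way to check the acceptance criterion is to profile both files and compare column count and the set of non-null columns. The following is only a rough sketch in Python using the standard library: the `profile` and `same_shape` helpers are hypothetical names, and the pipe delimiter and headerless layout are assumptions about the files, not something confirmed in this story.

```python
import csv
import io


def profile(text, delimiter="|"):
    """Return (column_count, indices of columns with at least one non-empty value).

    Assumes a headerless, delimiter-separated file where an empty field means null.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    width = 0
    non_null = set()
    for row in reader:
        width = max(width, len(row))
        non_null.update(i for i, value in enumerate(row) if value != "")
    return width, sorted(non_null)


def same_shape(sample_text, full_text, delimiter="|"):
    """True when both sources have the same width and the same non-null columns."""
    return profile(sample_text, delimiter) == profile(full_text, delimiter)


# Tiny made-up inputs (not real Fannie Mae data): same columns, different row counts.
sample = "a|1||\nb|2||\n"
full = "a|1||\nb|2||\nc|3||\n"
print(same_shape(sample, full))
```

If the two sources are harmonized, a check like this should pass regardless of how many rows each contains; comparing column names would be a natural extension once real names are found.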