Open Speccles96 opened 4 years ago
Hi @Speccles96, thanks for posting! I agree this would be a really good feature for the community.
I will flag this as Help Wanted until we get the chance to implement it ourselves. I/O is fairly abstract in Modin, so familiarity with the codebase is probably necessary to implement this.
Looking into the pandas SAS code, it looks like we'd need to vendor and adapt a bunch of it. Effectively, we'd need to add a skip_rows-like arg to the low-level reader so that each worker can read only its own slice of rows. It might be worth trying to get that as a feature in pandas so we don't have to maintain the extra code here.
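To make the skip_rows idea a bit more concrete, here is a minimal sketch of how row ranges could be assigned to workers once the low-level reader accepts an offset. The function name `partition_row_ranges` and the `(skip_rows, nrows)` pairing are assumptions for illustration only; neither is an existing pandas or Modin API.

```python
from typing import List, Tuple


def partition_row_ranges(total_rows: int, num_partitions: int) -> List[Tuple[int, int]]:
    """Split total_rows into contiguous (skip_rows, nrows) pairs.

    Hypothetical helper: each pair describes the slice one worker would
    read if the low-level SAS reader supported a skip_rows-like argument.
    """
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")

    # Spread the remainder over the first `extra` partitions so that
    # partition sizes differ by at most one row.
    base, extra = divmod(total_rows, num_partitions)

    ranges = []
    start = 0
    for i in range(num_partitions):
        nrows = base + (1 if i < extra else 0)
        if nrows == 0:
            # Fewer rows than partitions: stop instead of emitting empty slices.
            break
        ranges.append((start, nrows))
        start += nrows
    return ranges
```

Each worker would then open the file, skip `skip_rows` rows, read `nrows` rows, and the partial frames would be concatenated in order. The total row count would presumably have to come from the file header, which is part of why the reader internals would need adapting.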
I was trying to use read_sas with Modin and received this message:
There are no other Python packages that allow you to parallelize read_sas without some difficult workaround. It would be a nice feature to add for the community and for the folks who use .sas7bdat files regularly.