jkteske closed this issue 4 years ago
Good catch, I think you are correct.
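I usually just load this file with pandas, since read_csv can decompress bz2 directly.
Passing compression='bz2' explicitly avoids relying on the misleading .tar.bz2 extension, e.g.:
import pandas as pd
chains = pd.read_csv('HD21749_10_chains.csv.tar.bz2', compression='bz2')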
Fixed in v1.3.7 (PR #300).
I tried to open XX_chains.csv.tar.bz2 with:
tar -xvjf HD21749_10_chains.csv.tar.bz2
and got the following error:
tar: Unrecognized archive format
tar: Error exit delayed from previous errors.
Then I tried just decompressing it with bzip2, and that worked:
bzip2 -d HD21749_10_chains.csv.tar.bz2 --> HD21749_10_chains.csv.tar
So I concluded that something is wrong with the tarring of the file.
I then removed "tar" from the file name, e.g. HD21749_10_chains.csv.tar.bz2 --> HD21749_10_chains.csv.bz2,
and double-clicked the file (or used the bzip2 -d command above), which produced what looked like a correct csv file.
So, I think the chains files are just not tarred at all? I could not find where the tarring happens in the code, only the bzip2 compression. Am I missing something?
Thanks!
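For anyone who wants to confirm the diagnosis above without renaming files, here is a minimal sketch using only the Python standard library. It tries to open the file as a bzip2-compressed tar archive and, if that fails, as a plain bzip2 stream. The function name describe_bz2 and its return labels are hypothetical, chosen for illustration; they are not part of the project's code.

```python
import bz2
import tarfile


def describe_bz2(path):
    """Classify a file as 'tar.bz2', plain 'bz2', or 'unknown'.

    Hypothetical helper for illustration only.
    """
    # First attempt: a real tar archive wrapped in bzip2.
    try:
        with tarfile.open(path, mode="r:bz2") as archive:
            archive.getmembers()  # force the headers to be read
        return "tar.bz2"
    except tarfile.ReadError:
        pass  # not a tar archive (or not bzip2 at all)

    # Second attempt: a single bzip2-compressed file with no tar layer.
    try:
        with bz2.open(path, "rb") as stream:
            stream.read(1)  # raises OSError if the data is not bzip2
        return "bz2"
    except OSError:
        return "unknown"
```

If the chains file is bzip2-compressed but never tarred, as described above, describe_bz2('HD21749_10_chains.csv.tar.bz2') would be expected to return 'bz2' rather than 'tar.bz2'.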