blaze / castra

Partitioned storage system based on blosc. **No longer actively maintained.**
BSD 3-Clause "New" or "Revised" License
153 stars, 21 forks

"." columns #41

Closed: enku closed 9 years ago

enku commented 9 years ago

I read your post "Efficient Tabular Storage" and decided to give Castra a whirl. As luck would have it, I got an error on the very first data set I tried.

The problem is that the dataframe has a column labeled "." (a period, without the quotes) and, of course, on *nix this is an invalid name for a regular file, since "." always refers to the directory itself. I don't know if or how you would want to fix this. I've written a test case nonetheless.

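import pandas as pd
from castra import Castra


def test_column_with_period():
    # A column literally named '.' cannot be stored as a file of that name.
    df = pd.DataFrame({'x': [10, 20],
                       '.': [10., 20.]},
                      columns=['x', '.'],
                      index=[10, 20])

    # Extending the castra writes one file per column inside the partition directory.
    with Castra(template=df) as c:
        c.extend(df)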

Which yields:

>       with open(filename, 'wb') as fp:
E       IOError: [Errno 21] Is a directory: '/tmp/castra-uUz3Gr/10--20/.'
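
The failure can be reproduced without Castra at all, which helps confirm the cause. The sketch below assumes only what the traceback shows: each column is written to a file named after it inside the partition directory. The helper `try_write_column` is hypothetical and exists only for this illustration.

```python
import os
import tempfile


def try_write_column(partition_dir, column_name):
    """Try to create a file named after the column.

    Returns None on success, or the OSError that open() raised.
    (Hypothetical helper, not part of Castra.)
    """
    filename = os.path.join(partition_dir, column_name)
    try:
        with open(filename, 'wb') as fp:
            fp.write(b'')
    except OSError as exc:  # IOError is an alias of OSError on Python 3
        return exc
    return None


with tempfile.TemporaryDirectory() as partition_dir:
    ok = try_write_column(partition_dir, 'x')   # ordinary column name
    err = try_write_column(partition_dir, '.')  # resolves to the directory itself
    print(ok)
    print(err)
```

On a typical Linux system the second call should fail with the same "Is a directory" error as above, because joining a directory path with "." yields a path to that directory.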

mrocklin commented 9 years ago

Hi @enku, thanks for trying it out. I hope that it can (eventually) be useful for you. There is a fix for this immediate issue in #42. It could be improved, though, if you have suggestions. (Thanks also for the test.)
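
As one possible direction for such suggestions, column names could be escaped before they are used as file names. The following is a minimal sketch under that assumption; it is not taken from #42, and the function names `escape_column` and `unescape_column` are hypothetical. It percent-encodes path separators and other special characters, and additionally encodes periods, which `urllib.parse.quote` leaves untouched.

```python
from urllib.parse import quote, unquote


def escape_column(name):
    """Map a string column name to a string that is safe as a file name.

    Hypothetical helper: quote() percent-encodes '/' and '%' (among others)
    when safe='' is given, but never encodes '.', so periods are handled
    explicitly.
    """
    return quote(name, safe='').replace('.', '%2E')


def unescape_column(filename):
    """Invert escape_column()."""
    return unquote(filename)


for column in ['x', '.', '..', 'a/b', 'price.usd', '100%']:
    escaped = escape_column(column)
    print(repr(column), '->', repr(escaped))
```

This sketch only handles string names; non-string column labels and the empty string would need separate treatment.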