Closed jonblunt closed 11 years ago
use index instead of major_axis (that example is a panel)
eg 'index>=20120101'
u can also do
Term('index','>',date)
where date is a date like object ( or string)
just so u know deleting is not fast
and every once in a while run ptrepack on the file
a faster way might be to accumulate your rows for say today in a new table (just append them) then say at the end of the day read today's table in (which will only read the last of duplicate rows, and hence is slower when dups r there), and then append in one shot to your original table
but should be faster than delete/ append off of main table
Got the remove to work. Thanks
Batches are several days, usually eight days day received every seven days. The data set now covers several years of activity. The collecting process is not completely standardized and, unfortunately, it is possible some of the last rows in the batch represent partial work. Those will be trued up in the next transmission. Hence my approach.
I will now think about how to stop partial records getting loaded in the first place.
jb
.
great
ok 2 close?
Yes .
Thanks
On Thu, Apr 18, 2013 at 7:53 PM, jreback notifications@github.com wrote:
ok 2 close?
— Reply to this email directly or view it on GitHubhttps://github.com/pydata/pandas/issues/3396#issuecomment-16622049 .
I have a hd5store that accumulates data by batch. However the batches overlap and I want to just add the new rows..
The data is in a table with a major index of event timestamp. I can assume the new batch has complete data for the first day so all I need to do is to remove all rows with a timestamp >= min_new_date.
I am using python 2.7, pandas 0.11.0.dev-b41dc91 on Windows 7 64bit and lget the following error however I tried to pass in the date In this example I hard code the date string to simplify the example.
dstore = pyt.HDFStore(hdfile)
dstore Out[39]: <class 'pandas.io.pytables.HDFStore'> File path: Y:\HAVI_MONITORING\Log_Scan_Files\ROPLogHistComp2.h5 /files series (shape->[1])
/logdata frame_table (typ->appendable_multi,nrows->1148964,ncols->13,indexers->[index],dc->[FileId,Start,Wob,Start_date,Type_1]) /meta frame (shape->[1,1])
/stores frame_table (typ->appendable,nrows->1123,ncols->20,indexers->[index])
dstore.remove('logdata','major_axis>=20130327')
Exception Traceback (most recent call last)