giuse88 / duka

duka - Dukascopy historical data downloader
http://giuse88.github.io/duka
MIT License

Enhancement - Combine Files to Weekly/Monthly #1

Closed (crazy25000 closed this issue 8 years ago)

crazy25000 commented 8 years ago

Hey,

I'm writing a script to combine all the downloaded files into monthly files, but I wanted to suggest having this feature built in.

giuse88 commented 8 years ago

Hi @crazy25000

Yeah, I could add a command-line option that lets you specify how to aggregate the data. For example:

 --daily 
 --monthly
 --weekly
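
For illustration only, here is a minimal sketch of how such mutually exclusive flags could be parsed with Python's `argparse`. This is not duka's actual interface: the function name `build_parser`, the `aggregate` attribute, and the help texts are all assumptions made for the example.

```python
import argparse


def build_parser():
    """Build a parser with hypothetical --daily/--weekly/--monthly flags."""
    parser = argparse.ArgumentParser(
        description='Hypothetical aggregation options (illustration only)')
    # Only one aggregation period may be chosen per run.
    group = parser.add_mutually_exclusive_group()
    for period in ('daily', 'weekly', 'monthly'):
        group.add_argument(
            '--' + period,
            dest='aggregate',          # all three flags share one destination
            action='store_const',
            const=period,
            help='aggregate the output ' + period)
    return parser


if __name__ == '__main__':
    args = build_parser().parse_args()
    # args.aggregate is None when no flag is given.
    print('aggregation:', args.aggregate)
```

Passing two of the flags at once makes `argparse` exit with a usage error, which is the point of grouping them.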

crazy25000 commented 8 years ago

@giuse88 that would be perfect! I also noticed that the timestamps are incorrect: they show only minutes, seconds, and ticks, but no hours.

giuse88 commented 8 years ago

@crazy25000 That was a bug. It is fixed in 0.1.2, so updating should resolve it.

crazy25000 commented 8 years ago

@giuse88 awesome, thank you!

crazy25000 commented 8 years ago

Here is what I'm using to combine files. Memory usage gets really high when manipulating a large number of rows, so I don't touch the data or combine anything until all the individual files have been downloaded.

Step 1 is not posted because it's just for me to 'clean' the files.

import glob
import os

mypath = '/home/user/'
combinedCSV = mypath + 'output.csv'
# Exclude the output file itself, so a leftover from an earlier run
# is never read as input while it is being rewritten.
allCSVfiles = sorted(f for f in glob.glob(mypath + '*.csv') if f != combinedCSV)
totalFiles = len(allCSVfiles)
header_saved = False

print('..::| Step 2: Combine files |::..')
with open(combinedCSV, 'w', newline='') as fout:
    for count, filename in enumerate(allCSVfiles, 1):
        print('        ', str(count) + '/' + str(totalFiles))
        with open(filename, newline='') as fin:
            # The first line of each file is its header; keep only one copy.
            header = next(fin, None)
            if header is None:
                continue  # empty file, nothing to copy
            if not header_saved:
                fout.write(header)
                header_saved = True
            # Copy line by line so memory usage stays low.
            for line in fin:
                fout.write(line)

print('..::| Step 3: Delete old files |::..')
for filename in allCSVfiles:
    os.remove(filename)
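
The script above merges everything into one file. The monthly split requested at the top of this issue could be sketched along the same lines in Python 3. This is only a sketch under stated assumptions: the function name `combine_by_month` is made up, and it assumes that the first column of every data row is a timestamp starting with `YYYY-MM`, which this thread does not confirm for duka's output.

```python
import csv
import os
from contextlib import ExitStack


def combine_by_month(csv_paths, out_dir):
    """Stream rows from csv_paths into one '<YYYY-MM>.csv' file per month.

    Assumption: column 0 of each data row starts with 'YYYY-MM'.
    Returns a dict mapping each month to the number of data rows written.
    """
    writers = {}  # 'YYYY-MM' -> csv.writer for that month's output file
    counts = {}   # 'YYYY-MM' -> number of data rows written
    with ExitStack() as stack:  # closes every output file on exit
        for path in sorted(csv_paths):
            with open(path, newline='') as fin:
                reader = csv.reader(fin)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty files
                for row in reader:
                    if not row:
                        continue  # skip blank lines
                    month = row[0][:7]
                    if month not in writers:
                        out_path = os.path.join(out_dir, month + '.csv')
                        fout = stack.enter_context(
                            open(out_path, 'w', newline=''))
                        writers[month] = csv.writer(fout)
                        writers[month].writerow(header)  # header once per file
                        counts[month] = 0
                    writers[month].writerow(row)
                    counts[month] += 1
    return counts
```

Rows are written as they are read, so memory usage stays low regardless of input size; only one open file handle per month is kept. `out_dir` should be a different directory from the one holding the inputs, otherwise a later run would pick up the monthly files as inputs.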

giuse88 commented 8 years ago

I see. Thank you for sharing your code. I am actually fixing this problem in the develop branch: the new release will write all the data to a single file. In addition, I am adding support for candles with different time frames. I hope to get it done this weekend.

This is the PR for the new release: https://github.com/giuse88/duka/pull/7 @crazy25000

giuse88 commented 8 years ago

@crazy25000 Fixed in 0.1.6

crazy25000 commented 8 years ago

Awesome, thank you @giuse88!