Closed LemonBoy closed 8 years ago
The use of mmap brings the time for the conversion of a single timeframe from over 12m down to 6.40m (the run time for a single CSV drops from 0.208s to 0.08s) on this machine, at the expense of no longer being able to use stdin as input (that wasn't working well anyway, as seen in #25, probably due to a buffering issue linked to the use of readline).
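A minimal sketch of the mmap-based approach: instead of reading the CSV through `readline`, map the file into memory and split it on newlines directly. The function name and parsing are illustrative assumptions, not the repo's actual code.

```python
import mmap

def iter_csv_lines(path):
    # Hypothetical helper: yield each line of the file as bytes,
    # scanning the memory-mapped buffer instead of calling readline().
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            start = 0
            size = len(mm)
            while start < size:
                end = mm.find(b"\n", start)
                if end == -1:
                    # Last line without a trailing newline.
                    yield mm[start:size]
                    break
                yield mm[start:end]
                start = end + 1
```

The main cost saving comes from avoiding per-line buffered I/O: the kernel pages the file in on demand and the scan is a plain memory search.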
That's a non-issue: since a shared map is created, it should be shared among the various processes mapping the same file, so more than a single timeframe can be converted at once to get well past the 50m mark.
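The parallel angle can be sketched like this: several worker processes each map the same input file read-only, and the kernel's page cache means the data is loaded from disk only once. The worker function below is a stand-in (it just counts lines), not the repo's converter.

```python
import mmap
from multiprocessing import Pool

def count_lines(path):
    # Stand-in for converting one timeframe from the shared source file:
    # each process maps the same file; pages are shared via the page cache.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[:].count(b"\n")

def convert_all(path, n_workers=4):
    # Run one "conversion" per worker against the same mapped file.
    with Pool(n_workers) as pool:
        return pool.map(count_lines, [path] * n_workers)
```

Because the mappings are read-only, there is no coordination needed between the workers; each one pays only the cost of its own scan.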
The latest commit manages to convert the whole EURUSD-2014 dataset in less than 10m:
make 484.43s user 0.27s system 99% cpu 8:04.77 total
At this point I'm not sure whether I actually optimized the code or somehow broke the dataset, heh.
Please remove the line [1], it's a brainfart.
[1] https://github.com/FX31337/FX-BT-Scripts/blob/master/convert_csv_to_mt.py#L244
Can you quote the line? Currently it's pointing to:
`for tick in ticks:`
That's the line; the branch wasn't quite ready to be merged into master :(
That's fine, it was merged to re-validate it against the other PRs aiming at performance and avoid any conflicts; you can send the next PR with the speed-up and fixes.
Moved the changes to the performance branch (#54), so you may apply the fixes there.
I do see a 2x speedup; it might (or might not) be enough for #37.