Closed pnathan closed 13 years ago
I have used this project to read/write fairly large CSVs but speed has never been a priority for me as it has always been fast enough, or slowed down enough by the other processes involved, that it seemed unneeded. I am certainly willing to optimize (or include patches that do), but probably not at the expense of general ease of use.
Do those proclamations actually increase the speed of those? I have used such proclamations before and have generally found them to be only a minor help (ie (for me) not worth worrying about). I am definitely interested in any input you can provide (eg through profiling) about where the slowdown is occurring.
Thanks, Russ
Here is my profile from sb-profile:report (it only goes down to the function level)-
measuring PROFILE overhead..done
seconds | gc | consed | calls | sec/call | name
-------------------------------------------------------------
8.862 | 0.060 | 1,054,208,944 | 145,770 | 0.000061 | CL-CSV:READ-CSV-ROW
0.072 | 0.000 | 0 | 1 | 0.071692 | CL-CSV:READ-CSV
0.000 | 0.000 | 0 | 1 | 0.000000 | RETURN-BLANKS
0.000 | 0.000 | 0 | 1 | 0.000000 | PROCESS-CSV
-------------------------------------------------------------
8.933 | 0.060 | 1,054,208,944 | 145,773 | | Total
estimated total profiling overhead: 0.12 seconds
overhead estimation parameters:
0.0s/call, 8.e-7s total profiling, 4.e-7s internal profiling
And now, from the statistical SBCL profiler, one ton of information.
One of the big checks is sequencep. I'm wondering if type decl wouldn't remove that.
Number of samples: 942
Sample interval: 0.01 seconds
Total sampling time: 9.42 seconds
Number of cycles: 0
Sampled threads:
#<SB-THREAD:THREAD "initial thread" RUNNING {10029311B1}>
Self Total Cumul
Nr Count % Count % Count % Calls Function
------------------------------------------------------------------------
1 106 11.3 106 11.3 106 11.3 - SB-KERNEL:SEQUENCEP
2 99 10.5 338 35.9 205 21.8 - CL-CSV::%ESCAPE-SEQ?
3 88 9.3 114 12.1 293 31.1 - ELT
4 80 8.5 132 14.0 373 39.6 - LENGTH
5 61 6.5 61 6.5 434 46.1 - SB-IMPL::OPTIMIZED-DATA-VECTOR-REF
6 57 6.1 57 6.1 491 52.1 - SB-VM::GENERIC-+
7 52 5.5 121 12.8 543 57.6 - CL-CSV::%CHAR-AT
8 50 5.3 50 5.3 593 63.0 - SB-KERNEL:HAIRY-DATA-VECTOR-REF/CHECK-BOUNDS
9 44 4.7 899 95.4 637 67.6 - CL-CSV:READ-CSV-ROW
10 40 4.2 42 4.5 677 71.9 - SB-IMPL::FD-STREAM-READ-N-CHARACTERS/UTF-8
11 34 3.6 34 3.6 711 75.5 - "foreign function pthread_sigmask"
12 30 3.2 168 17.8 741 78.7 - VECTOR-PUSH-EXTEND
13 22 2.3 51 5.4 763 81.0 - SB-KERNEL:STRING-SUBSEQ*
14 21 2.2 21 2.2 784 83.2 - SB-KERNEL:UB32-BASH-COPY
15 18 1.9 18 1.9 802 85.1 - SB-IMPL::SLOW-HAIRY-DATA-VECTOR-SET
16 17 1.8 84 8.9 819 86.9 - (LABELS CL-CSV::FINISH-ITEM)
17 13 1.4 13 1.4 832 88.3 - SB-IMPL::OPTIMIZED-DATA-VECTOR-SET
18 12 1.3 23 2.4 844 89.6 - REPLACE
19 11 1.2 11 1.2 855 90.8 - "foreign function getrusage"
20 11 1.2 11 1.2 866 91.9 - SB-KERNEL:HAIRY-DATA-VECTOR-SET
21 10 1.1 60 6.4 876 93.0 - SB-IMPL::ANSI-STREAM-READ-LINE-FROM-FRC-BUFFER
22 8 0.8 8 0.8 884 93.8 - STRING
23 8 0.8 8 0.8 892 94.7 - TRUNCATE
24 5 0.5 14 1.5 897 95.2 - MAKE-ARRAY
25 4 0.4 939 99.7 901 95.6 - (LAMBDA (SB-INT:&MORE SB-PROFILE::ARG-CONTEXT SB-PROFILE::ARG-COUNT))
26 4 0.4 52 5.5 905 96.1 - ADJUST-ARRAY
27 4 0.4 10 1.1 909 96.5 - SB-KERNEL:%CEILING
28 4 0.4 4 0.4 913 96.9 - SB-KERNEL:%WITH-ARRAY-DATA
29 3 0.3 3 0.3 916 97.2 - SB-KERNEL:CSUBTYPEP
30 3 0.3 3 0.3 919 97.6 - SB-IMPL::OPTIMIZED-DATA-VECTOR-REF
31 3 0.3 3 0.3 922 97.9 - SB-KERNEL:HAIRY-DATA-VECTOR-REF
32 2 0.2 2 0.2 924 98.1 - (FLET #:CLEANUP-FUN-[REFILL-INPUT-BUFFER]731)
33 2 0.2 2 0.2 926 98.3 - (LABELS SB-PROFILE::%INCF)
34 2 0.2 2 0.2 928 98.5 - STRING-EQUAL
35 1 0.1 938 99.6 929 98.6 - CL-CSV:READ-CSV
36 1 0.1 18 1.9 930 98.7 - SB-IMPL::DATA-VECTOR-FROM-INITS
37 1 0.1 4 0.4 931 98.8 - SB-KERNEL:VALUES-SPECIFIER-TYPE
38 1 0.1 3 0.3 932 98.9 - (LABELS ROW-CHECKER)
39 1 0.1 1 0.1 933 99.0 - NTHCDR
40 1 0.1 1 0.1 934 99.2 - SB-IMPL::%VECTOR-WIDETAG-AND-N-BITS
41 1 0.1 1 0.1 935 99.3 - "foreign function closure_tramp"
42 1 0.1 1 0.1 936 99.4 - (LABELS SB-IMPL::SXHASH-RECURSE)
43 1 0.1 1 0.1 937 99.5 - GET-INTERNAL-RUN-TIME
44 1 0.1 1 0.1 938 99.6 - SB-KERNEL:%SET-FILL-POINTER
45 1 0.1 1 0.1 939 99.7 - SB-IMPL::GET-NEW-FILL-POINTER
46 1 0.1 1 0.1 940 99.8 - SB-KERNEL:SET-ARRAY-HEADER
47 1 0.1 1 0.1 941 99.9 - SB-IMPL::%WALK-DISPLACED-ARRAY-BACKPOINTERS
48 1 0.1 1 0.1 942 100.0 - SB-INT:EQUAL-BUT-NO-CAR-RECURSION
49 0 0.0 939 99.7 942 100.0 - PROCESS-CSV
50 0 0.0 939 99.7 942 100.0 - (EVAL (SB-SPROF:WITH-PROFILING (:MAX-SAMPLES 10000 :REPORT :FLAT :LOOP NIL)))
51 0 0.0 939 99.7 942 100.0 - SB-INT:SIMPLE-EVAL-IN-LEXENV
52 0 0.0 939 99.7 942 100.0 - EVAL
53 0 0.0 939 99.7 942 100.0 - INTERACTIVE-EVAL
54 0 0.0 939 99.7 942 100.0 - SB-IMPL::REPL-FUN
55 0 0.0 939 99.7 942 100.0 - "Unknown component: #x10000A2750"
56 0 0.0 939 99.7 942 100.0 - SB-IMPL::%WITH-REBOUND-IO-SYNTAX
57 0 0.0 939 99.7 942 100.0 - SB-IMPL::TOPLEVEL-REPL
58 0 0.0 939 99.7 942 100.0 - SB-IMPL::TOPLEVEL-INIT
59 0 0.0 939 99.7 942 100.0 - (FLET #:WITHOUT-INTERRUPTS-BODY-[RESTART-LISP]30)
60 0 0.0 939 99.7 942 100.0 - (LABELS SB-IMPL::RESTART-LISP)
61 0 0.0 70 7.4 942 100.0 - (LABELS CL-CSV::READ-LINE-IN)
62 0 0.0 42 4.5 942 100.0 - SB-INT:FAST-READ-CHAR-REFILL
63 0 0.0 34 3.6 942 100.0 - "foreign function interrupt_handle_pending"
64 0 0.0 9 1.0 942 100.0 - (FLET #:CLEANUP-FUN-[EXIT-TAG-[PROFILE-ENCAPSULATION-LAMBDAS]171]178)
65 0 0.0 8 0.8 942 100.0 - SUBTYPEP
66 0 0.0 5 0.5 942 100.0 - SB-KERNEL:SPECIFIER-TYPE
67 0 0.0 3 0.3 942 100.0 - (LABELS ROW-PROCESSOR)
68 0 0.0 2 0.2 942 100.0 - SB-IMPL::REFILL-INPUT-BUFFER
69 0 0.0 1 0.1 942 100.0 - REDUCE
70 0 0.0 1 0.1 942 100.0 - SB-PROFILE::INCF-COUNTER
------------------------------------------------------------------------
0 0.0 elsewhere
Cool, thanks for the profiling!
I am seeing:
What filesize (in bytes) are the files you are trying to speed up? I just want to make sure that I am testing similar situations
Also note that your proclamations are currently affecting your entire lisp system (they are global not local).
http://www.lispworks.com/documentation/HyperSpec/Body/f_procla.htm
I added some basic type checks to %char-at, %escape-seq?, and read-csv-row, shaving the time down to about 7.7 seconds. The number of bytes consed is 1,051,897,728 bytes consed
, which I suspect to be the key issue here.
I ran another check, checking to see about time coming from READ-LINE, which is not the fastest cat in the house.
seconds | gc | consed | calls | sec/call | name
-------------------------------------------------------------
7.313 | 0.180 | 745,290,272 | 145,770 | 0.000050 | CL-CSV:READ-CSV-ROW
0.782 | 0.090 | 306,590,960 | 145,770 | 0.000005 | READ-LINE
0.000 | 0.000 | 0 | 1 | 0.000000 | CL-CSV:READ-CSV
0.000 | 0.000 | 0 | 1 | 0.000000 | PROCESS-CSV
0.000 | 0.000 | 0 | 1 | 0.000000 | RETURN-BLANKS
-------------------------------------------------------------
8.095 | 0.270 | 1,051,881,232 | 291,543 | | Total
I am currently running:
(in-package :cl-csv-test)
(declaim (optimize (speed 3) (safety 0) (debug 0)))
(defun process-csv (csv-name process-function )
(let ((collector))
(declare (type list collector))
(labels ((row-processor (row)
(let ((result
(the list (funcall (the function process-function) row))))
(declare (type list result))
(if result
(push result collector)))))
(cl-csv:read-csv csv-name :row-fn #'row-processor)
collector)))
(defun return-blanks (filename)
(labels ((row-checker (row)
(declare (type list row))
(if (string-equal (the string (sixth row)) "")
(list (first row)
(sixth row)))))
(process-csv filename #'row-checker)))
(defun test-pnathan-code ()
(time-and-log-around (test-log "test-pnathan read")
(return-blanks +test-big-file+)))
This is the same code with none of the declarations (in a separate file so not affected by the declaims)
(defun process-csv2 (csv-name process-function )
(let ( collector )
(labels ((row-processor (row)
(let ((result (funcall process-function row)))
(when result
(push result collector)))))
(cl-csv:read-csv csv-name :row-fn #'row-processor)
collector)))
(defun return-blanks2 (filename)
(labels ((row-checker (row &aux (6th (sixth row)) )
(if (string-equal 6th "")
(list (first row) (sixth row)))))
(process-csv2 filename #'row-checker)))
(defun test-pnathan-code2 ()
(time-and-log-around (test-log "test-pnathan read")
(return-blanks2 +test-big-file+)))
Running these two back to back I get that the one without declarations is the same speed or a touch faster:
16:07:38 BEGIN test-pnathan read
16:07:42 END test-pnathan read
Evaluation took:
3.573 seconds of real time
3.490000 seconds of total run time (3.280000 user, 0.210000 system)
[ Run times consist of 0.260 seconds GC time, and 3.230 seconds non-GC time. ]
97.68% CPU
8,908,976,325 processor cycles
229,370,544 bytes consed
16:07:42 BEGIN test-pnathan2 read
16:07:45 END test-pnathan2 read
Evaluation took:
3.410 seconds of real time
3.320000 seconds of total run time (3.210000 user, 0.110000 system)
[ Run times consist of 0.200 seconds GC time, and 3.120 seconds non-GC time. ]
97.36% CPU
8,502,469,860 processor cycles
229,333,696 bytes consed
It seems like the proclamations are not helping your code run faster, but across the whole system perhaps they are. However running the whole image at speed 3 / safety 0 can have bad consequences for code not prepared for this.
To characterize the worst-case csv:
54,532,563 bytes 145,769 lines 38 columns
The internal data is company confidential, so I am unable to give you a copy file.
Edit: Sorry about the close/reopen, hit the wrong button.
This is my first time seriously working on optimizing Lisp code - I didn't know Proclaim was global. It... could be problematic for some code. :-/
Just wanted to let you know that I added a minor optimization, but I don't think this will increase your speed too much (in my brief testing it was ~10-20%) faster.
I just now see that you tried making the same functions a bit better. I think most of the improvement I squeezed out in the last patch was in regards to length checking (calling length once instead of mulitple times). In general I have not found type declarations to help matters nearly as much as removing duplicate effort. It has seemed to me that the SBCL compiler does a good job of correctly declaring types.
One thing that I know of that will increase speed (if your data allows) is by ensuring that we are getting simple-strings as opposed to strings. Currently asking on #lisp how to do this
Actually looking into this further read-line seems to be returning simple-strings by default on my box, so probably nothing else to get there,
Went back to declarations and (more) length call reduction in the tight / per character loop by taking your above proposition of type declarations in escape? and that seemed to shave off another ~10%.
I'm done for the day. I hope this helps make your life a bit easier. Please keep me informed about any further optimizations that you find.
Cheers, Russ
Russ,
Thanks for your hard work. Looks like the current edition from git runs in ~60% of the time I was originally getting.
5.345 seconds of real time
5.340000 seconds of total run time (5.290000 user, 0.050000 system)
[ Run times consist of 0.150 seconds GC time, and 5.190 seconds non-GC time. ]
99.91% CPU
15,994,870,092 processor cycles
1,051,878,176 bytes consed
Hi,
I've been working with largish CSV files (> 10K lines), and speed is getting to be a critical priority.
I grabbed the latest quicklisp distro of cl-csv, and despite all my type assertions in SBCL, I can not speed it up as fast as I really want it to be.
My approach here is to stream the file through and only pick out the datapoints I need.
Using this mechanism, I complete the batch job on my Big File (has 145K lines) in 8.4 seconds in Common Lisp, 3.7 seconds in Haskell, and 0.7 in CPython
I looked around the cl-csv code a bit and noticed it didn't seem very tuned. Is this my eyes or has it not had a signfiicant optimization pass on it?
I wouldn't mind collaborating to tune CL-CSV to optimize it.
I've included my code here so you can see what my approach is.