the-nightling / cudpp

Automatically exported from code.google.com/p/cudpp

Investigate why CUDPP libraries have gotten so large #18

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The release build of libcudpp.a is now over 110 MB on OS X and over 34 MB on Linux.
What is causing this?

Original issue reported on code.google.com by harr...@gmail.com on 25 Jun 2009 at 12:15

GoogleCodeExporter commented 9 years ago
On my Mac OS X machine it is 36 MB, not 110 MB.

Here is a breakdown of how much each component contributes on Mac OS X:

component    size (bytes)
scan              8396256
compact            175028
segscan          25003640
spmvmult           151168
radixsort         3735368
rand               155888

Original comment by shu...@gmail.com on 26 Jun 2009 at 1:02

GoogleCodeExporter commented 9 years ago
Thanks.  How did you figure that out?

Original comment by harr...@gmail.com on 26 Jun 2009 at 2:07

GoogleCodeExporter commented 9 years ago
I added the .cu files to the Makefile in cudpp one at a time. Each build gave a
cumulative library size, so I subtracted each size from the next one to get the
contribution of each file.

Original comment by shu...@gmail.com on 26 Jun 2009 at 2:11
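
The subtraction described above can be sketched as follows. This is a minimal illustration, not the script that was actually used: the function name `perFileSizes` is made up here, and it assumes you have recorded the library size after each .cu file was added.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Given the library size after each .cu file is added (a running total),
// return how many bytes each file contributed.
std::vector<std::int64_t> perFileSizes(const std::vector<std::int64_t>& cumulative) {
    std::vector<std::int64_t> sizes(cumulative.size());
    // Writes cumulative[0] first, then cumulative[i] - cumulative[i - 1] for i >= 1.
    std::adjacent_difference(cumulative.begin(), cumulative.end(), sizes.begin());
    // Sanity check: adding a file is not expected to shrink the library.
    for (std::int64_t s : sizes) assert(s >= 0);
    return sizes;
}
```

For instance, running totals of 8396256, 8571284 and 33574924 bytes give back 8396256, 175028 and 25003640, which are the scan, compact and segscan figures from the breakdown (assuming the files were added in that order).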

GoogleCodeExporter commented 9 years ago

Original comment by harr...@gmail.com on 29 Jun 2009 at 7:51

GoogleCodeExporter commented 9 years ago
Just a comment to help users who run into this problem. If you need to reduce
the CUDPP library binary size, you can comment out generation of the template
kernels you don't need. Each of the *_app.cu files has a Dispatch function for
the corresponding algorithm. To get the best performance, we use a large
switch/if-else that selects, at run time, the appropriate compile-time optimized
template kernel function. Every branch instantiates its own kernel, so to reduce
both compiled object size and compile time you can simply comment out the switch
options that you don't need.

For example, if you don't need segmented scan, comment out everything inside
cudppSegmentedScanDispatch():
http://code.google.com/p/cudpp/source/browse/tags/1.1/cudpp/src/app/segmented_scan_app.cu#386

Then, if you only need forward exclusive integer +-scans, comment out everything
but the lines that invoke that type of scan:

http://code.google.com/p/cudpp/source/browse/tags/1.1/cudpp/src/app/scan_app.cu#446

Your compile time and file size will be greatly reduced.

Perhaps the solution to this problem is to make compilation configurable in 
some way.  

Original comment by harr...@gmail.com on 10 Dec 2009 at 10:40
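
The dispatch pattern described above, and one way compilation could be made configurable, can be sketched on the CPU as follows. This is not CUDPP source. The enums, the `inclusiveScan` stand-in for a kernel, `scanDispatch`, and the `SKETCH_NO_MAX` / `SKETCH_NO_FLOAT` macros are all hypothetical names invented for this illustration. The point is only that each template instantiation referenced from the switch ends up in the binary, and that a preprocessor guard does the same job as commenting the branch out.

```cpp
#include <cassert>

// Hypothetical option enums; CUDPP's real enums and kernel signatures differ.
enum class Op { Add, Max };
enum class Type { Int, Float };

// Stand-in for a compile-time specialized kernel. Every distinct <T, op>
// pair referenced from the dispatcher below is compiled as separate code.
template <typename T, Op op>
void inclusiveScan(T* out, const T* in, int n) {
    if (n <= 0) return;
    out[0] = in[0];
    for (int i = 1; i < n; ++i) {
        out[i] = (op == Op::Add) ? out[i - 1] + in[i]
                                 : (out[i - 1] > in[i] ? out[i - 1] : in[i]);
    }
}

// Run-time dispatch to a compile-time specialized function.
// Returns false when the requested variant was compiled out, so a
// missing variant is reported instead of silently doing nothing.
bool scanDispatch(void* out, const void* in, int n, Type type, Op op) {
    assert(n >= 0);
    switch (type) {
    case Type::Int:
        switch (op) {
        case Op::Add:
            inclusiveScan<int, Op::Add>(static_cast<int*>(out),
                                        static_cast<const int*>(in), n);
            return true;
#ifndef SKETCH_NO_MAX  // hypothetical build switch: drop max-scans
        case Op::Max:
            inclusiveScan<int, Op::Max>(static_cast<int*>(out),
                                        static_cast<const int*>(in), n);
            return true;
#endif
        default:
            return false;
        }
#ifndef SKETCH_NO_FLOAT  // hypothetical build switch: drop float variants
    case Type::Float:
        switch (op) {
        case Op::Add:
            inclusiveScan<float, Op::Add>(static_cast<float*>(out),
                                          static_cast<const float*>(in), n);
            return true;
#ifndef SKETCH_NO_MAX
        case Op::Max:
            inclusiveScan<float, Op::Max>(static_cast<float*>(out),
                                          static_cast<const float*>(in), n);
            return true;
#endif
        default:
            return false;
        }
#endif
    default:
        return false;
    }
}
```

Building this sketch with `-DSKETCH_NO_FLOAT -DSKETCH_NO_MAX` leaves only the integer add-scan instantiated. Callers that request a removed variant get `false` back rather than an untouched output buffer.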

GoogleCodeExporter commented 9 years ago

Original comment by harr...@gmail.com on 6 Jul 2011 at 2:36

GoogleCodeExporter commented 9 years ago
Ha - my library is 280MB! (linux 64)

Original comment by rak...@gmail.com on 1 Oct 2011 at 8:42

GoogleCodeExporter commented 9 years ago
Yes, as we add support for more datatypes, it naturally multiplies the binary 
size.  The only solutions are separate compilation and linkage and/or runtime 
code generation.  The former is not supported by CUDA yet (will be in the 
future), and the latter is not easy in CUDA yet...

Original comment by harr...@gmail.com on 1 Oct 2011 at 10:25
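
To make the multiplication concrete, here is a small arithmetic sketch. The function `kernelVariants` and all of the counts are illustrative assumptions, not CUDPP's actual option lists. It only shows how the number of instantiated kernels grows when every combination of options gets its own template instance.

```cpp
// Number of kernel variants when every option combination is instantiated.
constexpr int kernelVariants(int datatypes, int operators,
                             int directions, int inclusivity) {
    return datatypes * operators * directions * inclusivity;
}

// Illustrative counts only: 5 datatypes, 4 operators, forward/backward,
// inclusive/exclusive.
static_assert(kernelVariants(5, 4, 2, 2) == 80, "5 * 4 * 2 * 2");
// Adding three more datatypes adds 3 * 4 * 2 * 2 = 48 more variants.
static_assert(kernelVariants(8, 4, 2, 2) == 128, "8 * 4 * 2 * 2");
```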

GoogleCodeExporter commented 9 years ago
If I only need cudppCompact, I assume I also need to keep cudppScanDispatch as
it is (i.e. not commented out)? If I comment out the *Dispatch() functions in
everything but cudppCompact, the library compiles but does not work. Would I
also need to keep reduceDispatch?

Original comment by rak...@gmail.com on 12 Oct 2011 at 5:03

GoogleCodeExporter commented 9 years ago
Not exactly.  Compact only needs a specific type of scan -- I believe it does a
forward exclusive sum scan of unsigned integers.  So if you comment out the
lines inside cudppScanDispatch (and the functions it calls) for everything but
forward, exclusive, operator+, and CUDPP_UINT, then it should work and be a
much smaller library.

Original comment by harr...@gmail.com on 12 Oct 2011 at 5:34
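
Why compact depends on that one scan variant can be seen in a CPU reference sketch. This is the textbook scan-based stream compaction, not CUDPP's implementation, and the function names `exclusiveSumScan` and `compact` are made up for this example. A forward exclusive sum scan over the 0/1 validity flags gives each kept element its position in the output.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Forward exclusive +-scan over unsigned ints: out[i] = in[0] + ... + in[i - 1].
std::vector<unsigned> exclusiveSumScan(const std::vector<unsigned>& in) {
    std::vector<unsigned> out(in.size());
    unsigned running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = running;
        running += in[i];
    }
    return out;
}

// Keep data[i] wherever valid[i] is nonzero, preserving order.
std::vector<float> compact(const std::vector<float>& data,
                           const std::vector<unsigned>& valid) {
    assert(data.size() == valid.size());
    // Normalize flags to 0/1 so the scan result is an output index.
    std::vector<unsigned> flags(valid.size());
    for (std::size_t i = 0; i < valid.size(); ++i) flags[i] = valid[i] ? 1u : 0u;

    const std::vector<unsigned> index = exclusiveSumScan(flags);
    // Total kept = last exclusive-scan value plus the last flag.
    const std::size_t count = flags.empty() ? 0 : index.back() + flags.back();

    std::vector<float> out(count);
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (flags[i]) out[index[i]] = data[i];  // scatter to scanned position
    }
    return out;
}
```

With flags 1, 0, 1, 1, 0 the scan yields 0, 1, 1, 2, 3, so the three kept elements land at output positions 0, 1 and 2.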