ladybug-tools / honeybee-dynamo

Honeybee library and plugin for Dynamo

GPU accelerated Radiance studies #40

Closed: TheodoreGalanos closed this issue 6 years ago

TheodoreGalanos commented 7 years ago

Hello everyone,

I had this idea through the machine learning work I've been doing recently. It seems that 3-phase Radiance is basically a series of matrix multiplications. That is exactly the kind of workload that has brought gaming GPUs to every AI researcher's computer.

I was wondering if we could look into making this possible through HB+. I've only had a quick look on the web and found this: https://buildings.lbl.gov/sites/all/files/2011-ibpsa-radiance.pdf. They seem to be using OpenCL to do it, with the code, at least in the paper, being just a few lines that create a matrix multiplication object and pass it to the GPU.

My idea of how to do it might sound a bit crazy, but we could actually use a neural network library to test this out! We'd need to transform the matrices to tensors and then use TensorFlow, PyTorch, Theano, etc., to pass them to the GPU, where the multiplications can happen. Passing a matrix around and doing simple multiplications is a very easy thing to do in these libraries, maybe a few lines of code.
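A minimal sketch of what I mean, using PyTorch; the matrix shapes below are just illustrative (a Klems full basis and a Reinhart MF:1 sky), and none of this is existing HB[+] code:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical shapes: 10,000 sensor points, 145 Klems patches,
# 145 sky patches + ground, 8760 hourly sky conditions.
V = torch.rand(10000, 145, device=device)    # view matrix
T = torch.rand(145, 145, device=device)      # BSDF / transmission matrix
D = torch.rand(145, 146, device=device)      # daylight matrix
sky = torch.rand(146, 8760, device=device)   # annual sky matrix

# The whole annual three-phase result reduces to chained products on the GPU.
illuminance = V @ T @ D @ sky
print(illuminance.shape)  # torch.Size([10000, 8760])
```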

I'm only just starting to test the new 3-phase method, so if someone has a way to separately extract the Mv, Mt and Md matrices to be multiplied, I can test this out on my side.

Just a shot in the dark!

Kind regards, Theodore.

sariths commented 7 years ago

Hi @TheodoreGalanos , have you tried Accelerad? On the Three Phase front, this example file that we used at the NYC hackathon has matrices as separate entities.

After running hundreds of phase-based calculations, including Three-Phase, Five-Phase and now F-Matrix (a.k.a. 4-phase/6-phase), over the past couple of years, I'd say that matrix calculations represent less than 5% of the total simulation runtime in any calculation. The majority of the time is spent in ray tracing, which involves shadow testing, virtual source calcs, and ambient calcs. The results from these ray-trace calculations are stored in the V, T, D and F matrices.

GPU calculations have intrigued me for a long time, but with all that I have going on with my research, dissertation, etc., I have hardly ever found the time to get too involved in them. To quote Hippocrates from Peter Norvig's timeless essay: "Life is short, [the] craft long, opportunity fleeting, experiment treacherous, judgment difficult."

Coincidentally, Mostapha and I were discussing multi-threading earlier today :)

Sarith

TheodoreGalanos commented 7 years ago

Hi @sariths, thank you for your response, I understand a bit more now.

It's sad to hear it's only 5% of the simulation run time; while reading about the method it seemed that it might have been more important. Is this also the case when we want to create animations over the whole year? I have to admit I'm not that familiar with the method, apart from having read the tutorials about it.

However, it seems Accelerad might already be the implemented solution! It looks amazing! It will probably work out-of-the-box on my Linux setup, where CUDA 8.0 is already installed. It seems that I can try pointing to the Accelerad versions of rpict and rtrace, even within HB!

It seems that it wouldn't be that hard to allow for an Accelerad option within HB+, right? Something like a switch that would point to those executables instead of the normal Radiance ones.
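A hypothetical sketch of what such a switch could look like; the accelerad_rtrace name follows how Accelerad ships its GPU builds alongside standard Radiance, but the function and file names here are made up, not existing HB+ code:

```python
import subprocess

def run_rtrace(options, octree, points_file, output_file, use_accelerad=False):
    # Run rtrace over a grid of points, optionally with the Accelerad build.
    exe = "accelerad_rtrace" if use_accelerad else "rtrace"
    cmd = [exe] + options + [octree]
    with open(points_file) as pts, open(output_file, "w") as out:
        subprocess.run(cmd, stdin=pts, stdout=out, check=True)

# e.g. run_rtrace(["-I+", "-ab", "4", "-ad", "50000", "-lw", "2e-5"],
#                 "room3ph.oct", "grid.pts", "results.res", use_accelerad=True)
```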

Anyways, I'll give all this a try tomorrow and report!

Kind regards, Theodore.

sariths commented 7 years ago

Hi @TheodoreGalanos , This..

Is this also in the case when we want to create animations over the whole year?

..is one instance where speeding up matrix multiplications will indeed be beneficial. Dctimestep, the tool used in that paper, is an optimized matrix multiplication tool (other matrix tools being rcollate and rmtxop). In the case of imaging, it is likely to be more useful than for illuminance-based calculations (which is what my previous comment was based on).
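For reference, a typical three-phase dctimestep call chains the view, transmission (BSDF) and daylight matrices against an annual sky matrix in a single shot. A rough sketch, with purely illustrative file names:

```python
import subprocess

# dctimestep writes the combined annual result to stdout; the file names
# below are placeholders, not files from this thread.
with open("annual_illuminance.mtx", "w") as out:
    subprocess.run(
        ["dctimestep", "view.vmx", "glazing.xml", "daylight.dmx", "sky.smx"],
        stdout=out, check=True)
```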

I don't see 3-Phase animations as being of any particular use, since the view after the T matrix is obscured by the hemispherical sampling on Klems patches (as shown below).

[screenshot, 2017-03-05: Three Phase rendering with the view obscured by Klems-patch sampling]

However, if one were to do matrix-based optimizations like the ones you mentioned on daylight coefficient calcs, then THAT would be really useful. Below is a daylight coefficients calculation for the same space as the one shown in the previous image. You can actually see clearly beyond the glazing in the image below. This image took around 30-odd minutes on a Unix cluster (28 min for the ray trace and 2 min for dctimestep).

[screenshot, 2017-03-05: daylight coefficients rendering of the same space, with the view beyond the glazing clearly visible]

Things get interesting in the images below, as the ray trace was only performed once and the rest were matrix calcs, i.e. (28 min + 2 min x 8). So, if the matrix calculation were to be sped up in this case, we'd actually save a lot of time.

[screenshot, 2017-03-05: grid of timestep renderings generated from the single ray trace]
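Roughly, that loop looks like the sketch below: the daylight coefficient images written once by the ray trace are reused, and only a cheap dctimestep call runs per sky condition. The %03d picture spec and the sky vector names are assumptions for illustration, not files from this thread.

```python
import subprocess

# Hypothetical single-timestep sky vectors (e.g. produced with genskyvec).
sky_vectors = ["sky_0900.smx", "sky_1200.smx", "sky_1500.smx"]

for sky in sky_vectors:
    out_name = sky.replace(".smx", ".hdr")
    with open(out_name, "w") as out:
        # "dc_%03d.hdr" stands in for the daylight coefficient images that the
        # single ray-trace step wrote; dctimestep only weights and sums them.
        subprocess.run(["dctimestep", "dc_%03d.hdr", sky],
                       stdout=out, check=True)
```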

To sum up, what they are saying in that paper is conditionally correct!

sariths commented 7 years ago

By the way, here are the results from a Three Phase illuminance simulation that I just ran to test how long the different parts of the calculation take. The timestamps are listed below.

Start time : 02:12:04
Time for octree : 02:12:04
rfluxmtx: running: rcontrib -fo+ -ab 4 -ad 50000 -lw 2e-5 -n 8 -I+ -y 100 -faa -c 1 -f klems_full.cal -p RHS=+1 -bn Nkbins -b 'kbin(0,1,0,0,0,1)' -m Glazing '!oconv -f -i room3ph.oct objects/GlazingVmtx.rad'
Time for vmtx : 02:12:18

rfluxmtx: opening pipe to: rcontrib -fo+ -ab 4 -ad 10000 -lw 1e-5 -n 8 -fdf -c 1000 -bn 1 -b 'if(-Dx*0-Dy*0-Dz*1,0,-1)' -m groundglow -f reinhartb.cal -p MF=1,rNx=0,rNy=0,rNz=-1,Ux=0,Uy=1,Uz=0,RHS=+1 -bn Nrbins -b rbin -m skyglow -y 145 '!oconv -f -i room3ph.oct skyDomes/skyglow.rad'
rfluxmtx: sampling 145 directions

Time for dmtx : 02:56:06

place New York City/Central_USA
latitude 40.78
longitude 73.96
time_zone 75.00
site_elevation 40.0
weather_data_file_units 1

Time for sky : 02:56:07
Time for matrix multiplication, dctimestep : 02:56:13

As you can see above, the final matrix multiplication step took only 7 seconds out of the 44 minutes and 9 seconds that the full simulation took to run.

TheodoreGalanos commented 7 years ago

@sariths I believe that is exactly what I meant! As I said, I've yet to delve into this, so what got my attention was what Mostapha briefly mentioned in his presentation during the AEC hackathon. I believe when he mentioned that we can now calculate the same scene for different times, he probably meant the daylight coefficient method you are referring to.

What I had in mind is creating a workflow that computes and renders image-based daylight studies for the hours of the day when sunlight matters. The animation idea would probably require high temporal granularity, and I kind of meant it as a sequence of images that portrays light variability during the day.

But I'm still not sure how to output the matrix multiplications from each of the timesteps in your last example. I will refrain from (additional) guessing and will try to delve into this tomorrow.

Once again thanks for your comments and insights!

Kind regards, Theodore.

sariths commented 7 years ago

@TheodoreGalanos ,

What I had in mind is creating a workflow that computes and renders image-based daylight studies for the hours of the day when sunlight matters. The animation idea would probably require high temporal granularity, and I kind of meant it as a sequence of images that portrays light variability during the day.

I agree. Here is a demo with Three Phase: https://www.youtube.com/watch?v=2rXrWhIf-4o. It's not recipe-fied in Honeybee yet, but all the backend functionality is already there.

TheodoreGalanos commented 7 years ago

@sariths Yes, that is exactly what I am looking for! Does it make any sense for me to ask you for the Radiance code for this? Or is it just easier to wait for an HB+ definition to do it there?

I'm already drifting into the cool statistical analysis we could do with such granularity. With the right intervals, maybe we could train some regression algorithms to fill in the gaps; in any case, it would be really interesting! (Although I'm not certain this would be more helpful than the actual annual analysis results, which are probably the same?)

Kind regards, Theodore.

sariths commented 7 years ago

Hi @TheodoreGalanos , that video is 60% Radiance + 30% Python + 10% Unix... which will (hopefully) be 99% HB sometime in the future.

Here is roughly what I did:

  1. Took a normal EPW file for NYC and then filled in values for Direct-Normal and Diffuse-Horizontal radiation for minute-based timesteps (to add granularity) through a combination of regression and interpolation.
  2. Created a custom WEA file based on this.
  3. Created an annual matrix based on that WEA file through gendaymtx.
  4. The ray-trace part I used was for F-Matrix calculations, but the ray-trace element can be based on Three Phase, Five Phase or (preferably) Daylight Coefficients.
  5. Then I created time-step images based on the annual matrix through dctimestep (which is the program they are optimizing in the paper that you mentioned in your original post).
  6. Looped through the images using a Python subroutine and converted them into pseudo-color (sans the legend) using falsecolor.
  7. Finally, I combined the HDR images using pcompos, converted them to TIFF and then merged them together (a rough sketch of steps 6 and 7 follows below).

(This entire process was a lot more organic than my description seems to imply.)
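A rough sketch of steps 6 and 7; the file names, the -s scale value and the -lw 0 -lh 0 flags used to drop the legend are illustrative rather than the exact script behind the video:

```python
import glob
import subprocess

hdr_files = sorted(glob.glob("timesteps/step_*.hdr"))   # hypothetical naming
fc_files = []

# Step 6: convert each timestep image to pseudo-color without a legend.
for hdr in hdr_files:
    fc = hdr.replace(".hdr", "_fc.hdr")
    with open(fc, "w") as out:
        subprocess.run(
            ["falsecolor", "-i", hdr, "-s", "2000", "-lw", "0", "-lh", "0"],
            stdout=out, check=True)
    fc_files.append(fc)

# Step 7: tile the pseudo-color frames into one composite and convert to TIFF.
with open("composite.hdr", "w") as out:
    subprocess.run(["pcompos", "-a", "4"] + fc_files, stdout=out, check=True)
subprocess.run(["ra_tiff", "composite.hdr", "composite.tif"], check=True)
```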

Step 1 isn't actually the right way to go about it, as there are better approaches that have already been identified in this paper by Walkenhorst et al. (I was too lazy to do that, as I just tried this out as a proof of concept.)

@mostaphaRoudsari and I began writing the Radiance-based source code for HB[+] to facilitate these kinds of workflows. For example, all the recipes that are now included in HB[+] are built from the ground up using Python-based wrappers for the core Radiance programs.

So, it will be possible to do such things in HB[+] sometime in the near future.

TheodoreGalanos commented 7 years ago

Hi @sariths

Thanks for sharing this. It sounds really exciting, and I think in the very near future we can integrate this workflow into real cases.

I think I can take a look at different options for the first step. It sounds like something that machine learning can help with. Perhaps if we train an LSTM on the time series data, we can use the predicted curve to fill in the 1-minute intervals. Also, the paper you linked seems quite helpful; I will take a look at it.

In the meantime, I'll start working with HB+ today. I should go through the examples and the current functionality to at least get a grip on the method. Then I can see whether I have the potential to contribute to this.

Edit: an interesting area I will look into is super-resolution. It is mainly used (with incredible results) to enhance image quality, but it has also been used to enhance time series resolution.

Once again, thanks for the insight. I appreciate it!

Kind regards, Theodore.

mostaphaRoudsari commented 6 years ago

This is now possible with Accelerad! I'm closing this issue as the main topic has been addressed.