bgoutorbe / seismic-noise-tomography

Python framework for seismic noise tomography

Optimisation. #17

Open boland1992 opened 9 years ago

boland1992 commented 9 years ago

Just wanted to say thank you very much for your code. It's been an invaluable tool for ANT modelling in the region I am studying for my master's thesis.

I just have a question on optimisation. Currently, the longest process (particularly for large data sets of 1 TB or more) is stacking the cross-correlations. Would it be useful to use the Python multiprocessing module to parallelise some of the code in crosscorrelation.py? Would this increase speed and make better use of the hardware on, say, larger server computers?

What other optimisation steps could be taken to speed up this code? Any suggestions would help.

Thank you for your time. If you would like any assistance in testing the code or for anything else, my email is boland1992@gmail.com.

Cheers, Benjamin Boland

bgoutorbe commented 9 years ago

Hi Benjamin,

thanks, I'm happy the code proved useful to other people. If you don't mind, could you send me your thesis or publication(s) when you're done with your work?

Your idea of parallelising the code is excellent. The pre-processing steps (instrument response, time-normalization, spectral whitening etc.) could be easily parallelised, with one process per daily seismic waveform. But this would be easier to implement by first gathering all these pre-processing steps into a single module function, say, pscrosscorr.preprocess(stream) (something I've long wanted to do, in order to make crosscorrelation.py more compact). Also, in the next step (calculation of cross-correlations), we could set up one process per pair of stations.
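The two-level idea above (one task per daily waveform, then one task per station pair) could be sketched roughly as follows. This is only an illustration under assumptions: `preprocess` here is a hypothetical stand-in that works on plain lists of samples (mean removal followed by one-bit normalization), not the real pre-processing in the framework, and `preprocess_all` and `station_pairs` are made-up helper names.

```python
import itertools
import multiprocessing as mp


def preprocess(samples):
    """Hypothetical stand-in for the real pre-processing: remove the mean,
    then apply one-bit normalization (keep only the sign of each sample)."""
    mean = sum(samples) / len(samples)
    demeaned = [x - mean for x in samples]
    return [(x > 0) - (x < 0) for x in demeaned]


def preprocess_all(daily_waveforms, processes=2):
    """Pre-process each daily waveform as a separate task in a worker pool."""
    with mp.Pool(processes=processes) as pool:
        # pool.map returns the results in the same order as the inputs
        return pool.map(preprocess, daily_waveforms)


def station_pairs(stations):
    """All unordered station pairs; each pair could become one task."""
    return list(itertools.combinations(sorted(stations), 2))


if __name__ == "__main__":
    waveforms = [[1.0, 2.0, 3.0], [4.0, 0.0, -4.0]]
    print(preprocess_all(waveforms))
    print(station_pairs(["STA1", "STA2", "STA3"]))
```

Because the worker function has to be picklable, it is defined at module level, which is one more reason to gather the pre-processing into a single function first.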

The problem with multiprocessing is that worker processes cannot change the state of objects held by the parent process, but I guess this can be taken care of with a few tricks.
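To make the state problem concrete, here is a minimal sketch showing that a change made inside a worker never reaches the parent's object, together with one possible workaround: have the workers return their results and do the accumulation in the parent. The `Stack` class, `daily_result` and `stack_days` are hypothetical names for illustration, not part of the framework.

```python
import multiprocessing as mp


class Stack:
    """Hypothetical accumulator standing in for a cross-correlation stack."""

    def __init__(self):
        self.total = 0.0
        self.ndays = 0

    def add(self, value):
        self.total += value
        self.ndays += 1


shared = Stack()  # lives in the parent; each worker gets its own copy


def add_in_worker(value):
    shared.add(value)  # changes the worker's copy only
    return value


def daily_result(day):
    """Hypothetical per-day computation; returns data instead of mutating."""
    return 2.0 * day


def stack_days(days, processes=2):
    """Compute in the workers, accumulate in the parent."""
    result = Stack()
    with mp.Pool(processes=processes) as pool:
        for value in pool.imap(daily_result, days):
            result.add(value)
    return result


if __name__ == "__main__":
    with mp.Pool(processes=2) as pool:
        pool.map(add_in_worker, [1.0, 2.0, 3.0])
    print(shared.ndays)  # still 0 in the parent
    print(stack_days([1, 2, 3]).total)
```

The trade-off is that every result has to be pickled and sent back to the parent, so it probably pays to return compact arrays rather than large objects.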

I'll have a look at that when I have some time.

Cheers Bruno

boland1992 commented 9 years ago

Hi Bruno,

If you send me your email address, then I'll be happy to send you my results once my thesis and/or publications are finished. I am using your code for a big chunk of my results after all! Again, my address is boland1992@gmail.com.

Thanks for your help on this. I'll let you know if I make any progress with multiprocessing, and please let me know about any tricks you find!

Cheers, Benjamin

bgoutorbe commented 9 years ago

Hi Benjamin,

my email is: goutorbe@hotmail.com

I just updated the code to move all the pre-processing steps into pscrosscorr.preprocessed_trace(). I haven't tested it thoroughly, so if I left a bug somewhere please report it. Note that in the process I got rid of a few legacy parameters that were related to spectrum analysis and did not really have their place in the cross-correlation script.

Cheers Bruno