Open MartinSchobben opened 1 year ago
Hi Martin,
That looks more than promising. Yeah, I was always aware of that bottleneck, but never found the time to solve it. So I am more than thrilled to implement your solution. I hope to do that soon. If the implementation is urgent for you, you can also file a PR, or let me know and I'll try to hurry up a little bit.
Best, Mirko
Hi Mirko,
There is no real hurry. I was peeling apart your code as an example for an application meant for internal use at TU Wien. I would like to have variograms for ASCAT soil moisture data, but for that I need some more performance, so I was profiling the code and looking to build in some memoization, e.g. for methods that contain expensive calculations. That is how I came across this, and I thought I'd let you know.
Perhaps I can show you this code some time, also to check whether credit is due anywhere in case it goes public.
I would also be happy to try and turn the above into a PR.
Kind regards, Martin
Yeah, nice. Then I guess the best way to proceed is that you open a PR with your code, which also makes you a collaborator (and co-author of the Zenodo publication). That way credit is attributed clearly and transparently. I can also assist with the PR, if you feel like I can be helpful.
Yes, and I should probably list you as a contributor (or cite your work) in my ASCAT application if it goes public, as I am relying heavily on your knowledge of spatial autocorrelation.
Just to let you know: There is also a model description paper on GMD, which you can cite if you use the package, DOI: 10.5194/gmd-15-2505-2022 URL: https://gmd.copernicus.org/articles/15/2505/2022/gmd-15-2505-2022.html
I ran into this problem while dealing with a large amount of data. I wonder whether replacing the for loop in Variogram._calc_groups() with numpy.digitize() might be an option. Not sure if the output is in the desired format, but it would surely be fast. See below for more details and some benchmark output:
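The idea can be sketched as follows. This is a minimal illustration, not the package's actual code: the function names `groups_loop` and `groups_digitize`, the half-open bin convention, and the use of -1 for distances beyond the last bin edge are all assumptions made for the example, and no timings are claimed.

```python
import numpy as np


def groups_loop(distances, bin_edges):
    """Hypothetical loop-based grouping: one boolean mask per bin."""
    groups = np.full(len(distances), -1, dtype=int)
    lower_bounds = np.concatenate(([0.0], bin_edges[:-1]))
    for i, (lo, hi) in enumerate(zip(lower_bounds, bin_edges)):
        # Assumed convention: bin i covers the half-open interval [lo, hi).
        groups[(distances >= lo) & (distances < hi)] = i
    return groups


def groups_digitize(distances, bin_edges):
    """Same grouping in a single vectorized call to numpy.digitize()."""
    # With right=False, digitize returns i where
    # bin_edges[i-1] <= x < bin_edges[i], and 0 for x < bin_edges[0].
    idx = np.digitize(distances, bin_edges, right=False)
    # Distances at or beyond the last edge fall outside every bin.
    idx[idx == len(bin_edges)] = -1
    return idx


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    distances = rng.uniform(0.0, 12.0, size=100_000)  # made-up distances
    bin_edges = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # upper bin edges
    same = np.array_equal(
        groups_loop(distances, bin_edges),
        groups_digitize(distances, bin_edges),
    )
    print("identical grouping:", same)
```

Both functions assume non-negative distances and monotonically increasing bin edges. Whether the half-open intervals match what the package expects would need to be checked against the real method; `numpy.digitize()` has a `right` argument that switches which side of each bin is closed.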