SciProgCentre / kmath

Kotlin mathematics extensions library
640 stars 55 forks source link

Signal processing + FFT #42

Open altavir opened 5 years ago

altavir commented 5 years ago

As referenced in https://github.com/mipt-npm/KEEP-math/blob/master/use-cases/kiselyov.md

Ixw123 commented 5 years ago

I think that i would really like to work on this, however i would like to be able to integrate my code fairly efficiently and with as little integration problems with the current code base as is possible. If there are any guidelines that you would like me to follow such as documentation, runtime efficiency or anything of that sort please let me know. This goes for the git comment in my commits as well. I have some experience using numpy and scipy, but mostly in respect to keras and using tensorflow as the backend. So if it is alright with you i would like to be able to confer with someone that is better versed in the algorithms needed to be employed, prior to coding up a lot for naught. As i am only an undergrad currently i hope it wouldnt be too bad to ask quite a few questions, thank you.

altavir commented 5 years ago

Very glad to hear it. Please proceed at pace you like, I do not have time to work on that in near future. The general guildlines are following:

For this specific task, It would be good to have a proposal first. Maybe in KEEP-math. We do not want to blindly duplicate numpy approach. Kotlin is much more expressive than python and could allow some very important features like coroutine-based online processing, which means the streaming API. Just few minutes ago I was thinking that something similar could be done to generalize @thomasnield statistics API.

There are some efforts to create a multiplatform wrapper for tensorflow. IT makes sense to investigate it and use as a back-end.

The discussion is quite welcome here as well as in slack

altavir commented 5 years ago

@kapot65 @Zelenyy, you are welcome to contribute to the discussion here and in slack. We can also invite Alexander Kiselyov, I do not know his account name.

Ixw123 commented 5 years ago

Do I need an invite to join that slack? I set up to have one emailed to me but it has my come through yet

On Thu, Jan 31, 2019, 01:54 Alexander Nozik <notifications@github.com wrote:

@kapot65 https://github.com/kapot65 @Zelenyy https://github.com/Zelenyy, you are welcome to contribute to the discussion here and in slack. We can also invite Alexander Kiselyov, I do not know his account name.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-459236805, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKbtTmOvjpnObi_LSQB4q5WHcDqs56ks5vIpM7gaJpZM4aUQ1M .

altavir commented 5 years ago

You can get an invitation here

Ixw123 commented 5 years ago

In terms of a proposal would you have a good example to start with

On Fri, Feb 1, 2019, 11:13 Alexander Nozik <notifications@github.com wrote:

You can get an invitation here https://surveys.jetbrains.com/s3/kotlin-slack-sign-up

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-459775962, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKbpLVeota_0Z90_FF5zmLlRon0M4Gks5vJGeRgaJpZM4aUQ1M .

altavir commented 5 years ago

You should start with just documenting the use case. What is the initial data and what do you do with it. I've almost finished the coroutines-based API that probably could be used both for regular array processing (like in python) and streaming processing simultaneously. I hope I will complete the initial design until the end of the week.

Ixw123 commented 5 years ago

Alright I will do a documentation of the artifacts and what is returned as well

On Thu, Feb 7, 2019, 00:29 Alexander Nozik <notifications@github.com wrote:

You should start with just documenting the use case. What is the initial data and what do you do with it. I've almost finished the coroutines-based API that probably could be used both for regular array processing (like in python) and streaming processing simultaneously. I hope I will complete the initial design until the end of the week.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-461293159, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKbopiLZf3hH3S4VlZIoNTTgZnjP_sks5vK7mcgaJpZM4aUQ1M .

Ixw123 commented 5 years ago

I will make a mock example up and show it to 6ou so you can tell me what to do if you dont mind

On Thu, Feb 7, 2019, 08:41 Micah Church <micah.church.a@gmail.com wrote:

Alright I will do a documentation of the artifacts and what is returned as well

On Thu, Feb 7, 2019, 00:29 Alexander Nozik <notifications@github.com wrote:

You should start with just documenting the use case. What is the initial data and what do you do with it. I've almost finished the coroutines-based API that probably could be used both for regular array processing (like in python) and streaming processing simultaneously. I hope I will complete the initial design until the end of the week.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-461293159, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKbopiLZf3hH3S4VlZIoNTTgZnjP_sks5vK7mcgaJpZM4aUQ1M .

altavir commented 5 years ago

OK.

Ixw123 commented 5 years ago

Compute 1 dimensional discrete Fourier Transform

Kmath.fft.fft( a, n=null, axis=-1,norm=None)

Please let me know if this works for the one function,also if their is a better way to contact you so i dont have to block up this page with text that isnt too pertinant. Also where i would need to post the final proposal and approach i will take, i know you said KEEP-Math however im uncertain as to where. Thank you for your time

altavir commented 5 years ago

I've fixed the formatting in your post to make it more readable. Some questions though:

Ixw123 commented 5 years ago

I'm about to sleep, I will answer these questions at length tomorrow morning. Thank you very much for the help in the interim.

On Fri, Feb 8, 2019, 01:00 Alexander Nozik <notifications@github.com wrote:

I've fixed the formatting in your post to make it more readable. Some questions though:

  • What is the type of a? If you have axis it is probably many-dimensional, but later you say that it is one-dimensional.
  • Could it be used in streaming mode or you need the whole array at once? Could it be spitted to sub-regions for parallelizatiion?
  • Does it make sense to have different algorithms for this?
  • What is the general workflow for the result? Does it make sense to store the reference to initial array?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-461700232, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKbpvoh3LkpZdGikZd_1moC2JJlFGIks5vLRKVgaJpZM4aUQ1M .

Ixw123 commented 5 years ago

Okay so for a it is an int, the input array is n dimensional array like. The output is always 1 dimension

So for the first run i was thinking make it arbitrary for the matrix, then later i could have functions that block the matrix into smaller sizes for quicker computations and that could be run with multi threading or processes.

I think one algorithm for this function however it could have different ones for different algorithms used to compute the FFT

I think that having the initial array reference stored may make sense and then that way if could help with error output

On Fri, Feb 8, 2019 at 1:07 AM Micah Church micah.church.a@gmail.com wrote:

I'm about to sleep, I will answer these questions at length tomorrow morning. Thank you very much for the help in the interim.

On Fri, Feb 8, 2019, 01:00 Alexander Nozik <notifications@github.com wrote:

I've fixed the formatting in your post to make it more readable. Some questions though:

  • What is the type of a? If you have axis it is probably many-dimensional, but later you say that it is one-dimensional.
  • Could it be used in streaming mode or you need the whole array at once? Could it be spitted to sub-regions for parallelizatiion?
  • Does it make sense to have different algorithms for this?
  • What is the general workflow for the result? Does it make sense to store the reference to initial array?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-461700232, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKbpvoh3LkpZdGikZd_1moC2JJlFGIks5vLRKVgaJpZM4aUQ1M .

altavir commented 5 years ago

For now you are mixing a lot of different problems into one. I would suggest to start with simple document describing what you want to do from mathematical point of view (including formulas). Then we can understand how to better implement it. Or you can start with implementation and we can integrate it.

The basic FFT is one-dimensional. Python uses secondary axis for vectorization which is neither requires, nor appreciated in Kotlin. We can do vectorization ourselves with functions or cycles.

There are existing Java implementations like commons-math which could be used for a start, we need only to invent convenient kotlin-ish API for it.

Ixw123 commented 5 years ago

Alright I will do it by like a row at a time and I will have the mathematics describing what I have to do or the psuedo code of it

On Fri, Feb 8, 2019, 11:36 Alexander Nozik <notifications@github.com wrote:

For now you are mixing a lot of different problems into one. I would suggest to start with simple document describing what you want to do from mathematical point of view (including formulas). Then we can understand how to better implement it. Or you can start with implementation and we can integrate it.

The basic FFT is one-dimensional. Python uses secondary axis for vectorization which is neither requires, nor appreciated in Kotlin. We can do vectorization ourselves with functions or cycles.

There are existing Java implementations like commons-math http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/index.html which could be used for a start, we need only to invent convenient kotlin-ish API for it.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-461863752, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKbrBoz6ZPOyFeDW0TeY0-donN1o5Dks5vLaepgaJpZM4aUQ1M .

Ixw123 commented 5 years ago

But I will look into commons-math and probably the api for kotlin of that would take precedence then

On Fri, Feb 8, 2019, 11:39 Micah Church <micah.church.a@gmail.com wrote:

Alright I will do it by like a row at a time and I will have the mathematics describing what I have to do or the psuedo code of it

On Fri, Feb 8, 2019, 11:36 Alexander Nozik <notifications@github.com wrote:

For now you are mixing a lot of different problems into one. I would suggest to start with simple document describing what you want to do from mathematical point of view (including formulas). Then we can understand how to better implement it. Or you can start with implementation and we can integrate it.

The basic FFT is one-dimensional. Python uses secondary axis for vectorization which is neither requires, nor appreciated in Kotlin. We can do vectorization ourselves with functions or cycles.

There are existing Java implementations like commons-math http://commons.apache.org/proper/commons-math/javadocs/api-3.6.1/index.html which could be used for a start, we need only to invent convenient kotlin-ish API for it.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-461863752, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKbrBoz6ZPOyFeDW0TeY0-donN1o5Dks5vLaepgaJpZM4aUQ1M .

altavir commented 5 years ago

The API is rather simple it is just complex array transform. I can very easily add it to kmath-commons module. The actual question is whether we need something more like operation chaining etc.

Ixw123 commented 5 years ago

I think operation training could be used very easily, I'm fairly new to kotlin however so I'm not too sure which would make more sense in terms of the language

On Fri, Feb 8, 2019, 11:45 Alexander Nozik <notifications@github.com wrote:

The API is rather simple it is just complex array transform. I can very easily add it to kmath-commons module. The actual question is whether we need something more like operation chaining etc.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-461866815, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKbspZxWdP_gGGBn6kIN5oEI_LDBPZks5vLanGgaJpZM4aUQ1M .

Ixw123 commented 5 years ago

I think like the summation with eulers formula stuff would be useful to have as a function, a function that goes over the rows and then it sends the appropriate variables to the euler function. I think once it is implemented to compute the appropriate sltransform functionality can be added afterwards

On Fri, Feb 8, 2019, 12:12 Micah Church <micah.church.a@gmail.com wrote:

I think operation training could be used very easily, I'm fairly new to kotlin however so I'm not too sure which would make more sense in terms of the language

On Fri, Feb 8, 2019, 11:45 Alexander Nozik <notifications@github.com wrote:

The API is rather simple it is just complex array transform. I can very easily add it to kmath-commons module. The actual question is whether we need something more like operation chaining etc.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-461866815, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKbspZxWdP_gGGBn6kIN5oEI_LDBPZks5vLanGgaJpZM4aUQ1M .

altavir commented 5 years ago

Vectorization in python is required to achieve performance with C back-end. You keep kotlin code much closer to your actual math notation. Anyway, performance optimization is the next step. First we need to understand what to do in math notation.

Ixw123 commented 5 years ago

Alright I will make a mathematical mock up and we can go from there

On Fri, Feb 8, 2019, 12:30 Alexander Nozik <notifications@github.com wrote:

Vectorization in python is required to achieve performance with C back-end. You keep kotlin code much closer to your actual math notation. Anyway, performance optimization is the next step. First we need to understand what to do in math notation.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-461881761, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKblD8lIqqhRiA1GCCSyq832g4Cwj1ks5vLbRSgaJpZM4aUQ1M .

Ixw123 commented 5 years ago

I want to preface this message with an apology for the radio silence.

Anyways DFT Take x[n] and transform them into X[k]

x[n]: A sequence of complex numbers of length N X[k]: The output of sequence of complex numbers of length N

X[k] is defined as X[k]=sum(n=0, N-1){x[n] e^((-i2pi/N)kn)}

To simplify using eulers formular the definition becomes

X[k]=sum(n=0, N-1){x[n] (cos(2pikn/N)-isin(2pikn/N))}

Let me know if this is clear enough or if I need to explain more, thanks

On Fri, Feb 8, 2019, 12:32 Micah Church <micah.church.a@gmail.com wrote:

Alright I will make a mathematical mock up and we can go from there

On Fri, Feb 8, 2019, 12:30 Alexander Nozik <notifications@github.com wrote:

Vectorization in python is required to achieve performance with C back-end. You keep kotlin code much closer to your actual math notation. Anyway, performance optimization is the next step. First we need to understand what to do in math notation.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-461881761, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKblD8lIqqhRiA1GCCSyq832g4Cwj1ks5vLbRSgaJpZM4aUQ1M .

altavir commented 5 years ago

OK. Seems good. This specific operation seems to be done on the whole array. It could be done in different ways. We can just write an extension over complex buffer, or we can add a context for such operation. The first solution is preferred if we will ever have only one such transformation and never will need state.

@Zelenyy, is it possible that we will need some kind of state for it? It seems so, especially for some kind of parametric wavelets or whatever. Also, do we also preserve initial buffer size?

Ixw123 commented 5 years ago

I could look into other possible implementations of DFTs if you would prefer and see perhaps which out could computationally be more efficient.

On Fri, Feb 15, 2019, 02:53 Alexander Nozik <notifications@github.com wrote:

OK. Seems good. This specific operation seems to be done on the whole array. It could be done in different ways. We can just write an extension over complex buffer, or we can add a context for such operation. The first solution is preferred if we will ever have only one such transformation and never will need state.

@Zelenyy https://github.com/Zelenyy, is it possible that we will need some kind of state for it? It seems so, especially for some kind of parametric wavelets or whatever. Also, do we also preserve initial buffer size?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-463942127, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKbjXdj0aP9sxEq5r1lJW2Yxm90354ks5vNmd3gaJpZM4aUQ1M .

altavir commented 5 years ago

I am thinking about design, not implementation. But yes, more examples would be good.

Ixw123 commented 5 years ago

Alright I will look up some more examples of the general design of the DFT algorithm and barring anything there could delve into the INVDFT

On Fri, Feb 15, 2019, 03:26 Alexander Nozik <notifications@github.com wrote:

I am thinking about design, not implementation. But yes, more examples would be good.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-463950337, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKbpctoQDkoo4WeedLtvgk46-EqO6hks5vNm8lgaJpZM4aUQ1M .

altavir commented 5 years ago

I've added the transformation implementations from commons-math here. The usage is following:

val myComplexBuffer: Buffer<Complex>// take it from somewhere
val res = Transformations.fourier().invoke(myComplexBuffer)

I've also finished streaming support for buffer processing and added helper functions to organized the workflow. Currently, if you have a streaming producer of Complex numbers from somewhere, you can split it into fixed size chunks and process run through fft like this:

val producer: Producer<Complex>
val resultProducer: Producer<Complex> = producer.chunked(200).FFT()

It is not palatalized yet, but could be easily added in the future. For now, the whole cycle is still untested, so I would very grateful if someone would create some test material like file with initial numbers and final expected numbers or some algorithm to check the result.

Ixw123 commented 5 years ago

I will make a script that will test it and compare it with the python output of the equivalent

On Sun, Feb 17, 2019, 07:49 Alexander Nozik <notifications@github.com wrote:

I've added the transformation implementations from commons-math here https://github.com/altavir/kmath/blob/dev/kmath-commons/src/main/kotlin/scientifik/kmath/transform/Transformations.kt. The usage is following:

val myComplexBuffer: Buffer// take it from somewhereval res = Transformations.fourier().invoke(myComplexBuffer)

I've also finished streaming support for buffer processing and added helper functions to organized the workflow. Currently, if you have a streaming producer of Complex numbers from somewhere, you can split it into fixed size chunks and process run through fft like this:

val producer: Producerval resultProducer: Producer = producer.chunked(200).FFT()

It is not palatalized yet, but could be easily added in the future. For now, the whole cycle is still untested, so I would very grateful if someone would create some test material like file with initial numbers and final expected numbers or some algorithm to check the result.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/altavir/kmath/issues/42#issuecomment-464453111, or mute the thread https://github.com/notifications/unsubscribe-auth/AONKbinIElDYtXjR5l4y5m10OovcLFl5ks5vOU_2gaJpZM4aUQ1M .

Ixw123 commented 5 years ago

Please disregard the previous message the function I used to attempt to validate the output was the wrong function