ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Synthetic Time Series Data #974

Open DLWCMD opened 3 years ago

DLWCMD commented 3 years ago

Is your feature request related to a problem? Please describe. Uber/Ludwig wishes to enhance its support for time series modeling and analysis. While there are many existing time series available for downloading, such as the LA Weather Archive used in a Ludwig TS example, a 'self-contained' Ludwig application would offer many benefits as a test bed for TS analysis with Ludwig, including comparison of various tools and methods.

Describe the use case Incorporate user-configurable time series into Ludwig, integrated with its various TS tools and methods, such as sequencing.

Describe the solution you'd like I have developed Python code, based on the Mackey-Glass equation, which supports creating simulated (AKA 'synthetic') time series of any length and configuration (e.g., univariate, multivariate with user-defined prediction horizons). MG time series are designed to be chaotic and thus present a challenging test-bed for TS analysis and prediction and so should support creation of robust solutions. I would be happy to contribute my code and/or support Uber staff to incorporate it into Ludwig.
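The contributor's code is not shown in this thread. As a rough illustration of the idea only, here is a minimal sketch that generates a Mackey-Glass series with a simple Euler step and slices it into input windows with a user-defined prediction horizon. The function names `mackey_glass` and `make_windows`, the default parameter values, and the Euler discretization are all assumptions made for this example; none of it is part of Ludwig's API.

```python
def mackey_glass(n_samples, tau=17, beta=0.2, gamma=0.1, n=10, dt=1.0, x0=1.2):
    """Generate a Mackey-Glass series with a forward Euler step (hypothetical helper).

    Discretizes dx/dt = beta * x(t - tau) / (1 + x(t - tau)**n) - gamma * x(t).
    The defaults are commonly used benchmark values, assumed here for illustration.
    """
    delay = int(round(tau / dt))      # number of steps spanning the delay tau
    x = [x0] * (delay + 1)            # constant initial history
    for _ in range(n_samples):
        x_tau = x[-1 - delay]         # delayed value x(t - tau)
        dx = beta * x_tau / (1.0 + x_tau ** n) - gamma * x[-1]
        x.append(x[-1] + dt * dx)
    return x[delay + 1:]              # drop the initial history


def make_windows(series, window, horizon):
    """Split a series into (inputs, targets) pairs (hypothetical helper).

    Each input holds `window` consecutive values; its target is the value
    `horizon` steps after the last value of the window.
    """
    inputs, targets = [], []
    for i in range(len(series) - window - horizon + 1):
        inputs.append(series[i:i + window])
        targets.append(series[i + window + horizon - 1])
    return inputs, targets


series = mackey_glass(500)
inputs, targets = make_windows(series, window=20, horizon=5)
```

A multivariate variant could be built the same way, for example by generating several series with different `tau` values and stacking them as columns.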

Describe alternatives you've considered Integrate Ludwig with the University of California, Irvine (UCI) Machine Learning Repository, which includes over 100 time series collected from around the world.

Additional context The proposed application could also be used to support development and testing of new Ludwig TS tools and methods.

w4nderlust commented 3 years ago

Thanks @DLWCMD for this feature request. At the moment, when we generate synthetic datasets for timeseries we just use random values (as the intent is just to test if things work end to end rather than testing performance), and what you are suggesting sounds like a nice improvement! Would you have any interest in helping out with implementing this? We can provide guidance if needed ;)

DLWCMD commented 3 years ago

Good to hear from you, and yes, I would like to contribute. I have been working on understanding the Mackey-Glass (MG) equation and generating sequences and time series from it. One of the benefits of using this equation is that the degree of “chaos,” that is, the difficulty of modeling and predicting the associated TS, can be varied by changing the equation parameters.
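For readers unfamiliar with it, the Mackey-Glass equation is commonly written as the delay differential equation below. The symbols follow the usual textbook notation and are not taken from this thread: $\tau$ is the delay, $\beta$ and $\gamma$ are the production and decay rates, and $n$ controls the nonlinearity.

```latex
\frac{dx}{dt} = \beta \, \frac{x(t - \tau)}{1 + x(t - \tau)^{n}} - \gamma \, x(t)
```

With the frequently used benchmark values $\beta = 0.2$, $\gamma = 0.1$, $n = 10$, the setting $\tau = 17$ is a standard choice for producing chaotic behavior, and the delay $\tau$ is the parameter most often varied to change how hard the series is to predict.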

I have some examples that suggest how MG might be used, which I would be happy to share with you.

Thanks again for your response. I will say I am much impressed with Ludwig, which I was touting to my son, who works with ML/AI for Johns Hopkins Applied Physics Lab. I like the range of functionality, from no coding for basic situations, to extending through use of Python features.

Talk soon.

David L. Wilt