Closed henriquepaes1 closed 3 weeks ago
Thanks for opening your first issue here! Be sure to follow the issue template!
@henriquepaes1: This issue is currently awaiting triage.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
@henriquepaes1: There are no 'kind' label on this PR. You need a 'kind' label to generate the release note automatically.
/kind feature
/kind bug
/kind question
@henriquepaes1: There are no area labels on this issue. Adding an appropriate label will greatly expedite the process for us. You can add as many area as you see fit. If you are unsure what to do you can ignore this!
/area ui-ux
/area semantics
/area translation
/area security
Hi @henriquepaes1 , I am extremely sorry for the confusion. Are you referring to the example in the documentation? I would try to explain it here. So, first let's see the functions used to simulate the Rossler attractor.
def rossler(t, X, a, b, c):
"""The Rössler equations."""
x, y, z = X
xp = -y - z
yp = x + a*y
zp = b + z*(x - c)
return xp, yp, zp
def rossler_time_series(tmax,n, Xi, a, b, c):
x, y, z = Xi
X=[x]
Y=[y]
Z=[z]
dt=0.0001
for i in range(1, tmax):
#print('Xi:', Xi)
x, y, z = Xi
xp, yp, zp=rossler(t, Xi, a, b, c)
x_next=x+dt*xp
y_next=y+dt*yp
z_next=z+dt*zp
X.append(x_next)
Y.append(y_next)
Z.append(z_next)
Xi=(x+dt*xp,y+dt*yp,z+dt*zp)
X=np.array(X)
Y=np.array(Y)
Z=np.array(Z)
step=int(tmax/n)
indices = np.arange(0,tmax, step)
#print('IS NAN X:',np.sum(np.isnan(X[indices])))
#print('IS NAN Y:',np.sum(np.isnan(Y[indices])))
#print('IS NAN Z:',np.sum(np.isnan(Z[indices])))
return X[indices], Y[indices], Z[indices]
This one gives the time series using the differential equation. As long as the delta t value is sufficiently small this function can approximate the system. Now, Rossler attractor has three parameters, a, b, and c. If we fix b to 0.5 and c to 5.7, above a = 2 the system will be chaotic and below a = 2 it will be periodic. So, we will set a set of values for that. And in the example, we are also artificially varying SNR by adding a Gaussian noise.
b = 0.2
c = 5.7
a_array = [0.1,0.15,0.2,0.25,0.3]
SNR = [0.125,0.25,0.5,0.75,1.0,1.25,1.5,1.75,2.0]
Now, we loop over different values of SNR, a and we repeat the process for each combination of SNR and a values for n times, with varying time series length, and initial points. So, the values of SNR and a can repeat. And for keeping everything same for combinations of SNR and a, we set the same random seed.
for snr in tqdm(SNR): # Looping over SNR values
for j in tqdm(range(len(a_array)): # Looping over values of parameter a
a=a_array[j]
random.seed(1)
np.random.seed(seed=301)
rp_sizes = random.sample(range(150, 251), 10) # selecting a length for time series
for k in tqdm(range(5)): # Repeats
u0, v0, w0 = 0+(1*np.random.randn()), 0+(1*np.random.randn()), 10+(1*np.random.randn()) # Setting initial conditions
for k2 in tqdm(range(len(rp_sizes))): # Looping over different RP sizes(time series length)
tmax, n = int(1000000*(rp_sizes[k2]/250)), rp_sizes[k2]
print('started model')
Xi=(u0, v0, w0)
t = np.linspace(0, tmax, n)
x, y, z=rossler_time_series(tmax,n, Xi, a, b, c)
x=add_noise(x, snr) # Adding noise
y=add_noise(y, snr)
z=add_noise(z, snr)
u[:,0]=x # Defining the output matrix
u[:,1]=y
u[:,2]=z
np.save('/user/swarag/Rossler/signals/('+str(snr)+','+str(a)+','+str(u0)+','+str(v0)+','+str(w0)+','+str(rp_sizes[k2])+')~.npy',u) # Save the data to a numpy file
Then we extract features from these files
input_path = '/user/swarag/Rossler/signals' # directory to which the signals are saved
RP_dir = '/user/swarag/Rossler/RP' # directory to which we want to save the RPs
RP_computer(input_path, RP_dir)
Dict_RPs=windowed_RP(68, 'RP') # Specifying window size and folder in which RPs are saved
First_middle_last_sliding_windows_all_vars(Dict_RPs,'Rossler_data.csv')
And here, if you are talking about this csv file, this will have 4 rows for the same RP, because, the function estimates RQA features from whole RP, mean, median and mode of the sliding windows.
But, for any purposes, we need to code variables that denotes the behaviour of the system. By default all of these will be stored in a column named "group"
data = pd.read_csv('Rossler_data.csv')
FILE = np.array(data['group']) # In the output data, the field named 'group' will have file name which contains details
SNR =[]
A =[]
U0 =[]
V0 =[]
W0 =[]
SIZE =[]
for FI in FILE:
info = ast.literal_eval(FI)
SNR.append(info[0])
A.append(info[1])
U0.append(info[2])
V0.append(info[3])
W0.append(info[4])
SIZE.append(info[5])
data['snr'] = SNR
data['a'] = A
data['u0'] = U0
data['v0'] = V0
data['w0'] = W0
data['length'] = SIZE
A = np.array(A)
SYNCH = 1*(A>0.2) # Defining synchrony condition, parameter value belonging to the chaotic region
data['synch'] = SYNCH
So, it is possible to have repeating columns. Kindly reply if this is not answering your query. Because SNR, a and initiation points (u0, v0, w0) can repeat in this code, and this values can have n number of RP sizes. Further, when you extract, it will be one whole RP estimate + four windowed estimate. Depending on what feature you are interested in and which SNR, you will have to subset the data. The repeating nature is there to have similar set of condition for combination of some variable. Hope this answer would be satisfactory. Thank you so much!
I appreciate your prompt reply. I'm still trying to understand the script output. For computation speed purposes, I fixed a and SNR values and generated the following signals directory:
Then, I called the functions to apply RQA analysis:
maker.RP_computer(input_path, RP_dir)
Dict_RPs=sd.windowed_RP(68, RP_dir, base_directory + "\\Rossler\\pickle")
metrics = ["recc_rate", "percent_det", "avg_diag", "max_diag", "percent_lam", "avg_vert", "vert_ent", "diag_ent", "vert_max"]
sd.First_middle_last_sliding_windows_all_vars(Dict_RPs, base_directory + "\\Rossler\\Rossler_data.csv", ["median"], metrics)
And the resulting csv was:
I'd like to understand the meaning of each line better. Is it the separated RQA for each one of the groups?
I don't know if I misinterpreted something with the technique. I was expecting only one resulting line as if we applied RQA considering all the signals in the directory as a single input and generated a combined RQA for all those signals.
Yes, ignore the first column, which is label. "window" is basically the estimate you are looking at (median = median of sliding window estimates, mean = mean of sliding window estimates, mode = model of sliding window estimates, whole = from whole RP), "group" is the name of the RP file ( which is a numpy file ) and rest are RQA variables
So it applies the sliding window individually for each numpy file in the specified directory?
Yes, it applies the sliding window of specified length to each of the RP files saved
Also, currently the step size between sliding windows is 1, will update that to take a custom step size
I understand. Is there a way to generate a single RP for multiple signals? For example, in this Rossler Attractor example, would it be possible to generate a single RP for all the signals in the directory?
RPs are created for a time series. It shows the recurring instances in the time series by one and the rest by 0. If your signals are time-locked, it is possible to run a multidimensional RQA. Because each time series represents an individual system and its RQA variable, it depends on your time series. But, if you ask about combined RP for multiple time series from different processes, that won't be possible as the RP won't capture the recurring characteristics adequately. And even with multidimensional RQA, computation might be tricky when the number of dimensions is above some values. However, you can still get some trends from the RQA variables estimated. If discussing an interactive task or something similar, consider checking about epistemic network analysis. Still, you will also be looking at a space with some features. Hope this helps
Fine! Thank you so much for the clarification
I hope you're happy with this clarification. Closing this issue
I tried to replicate the Rossler Attractor example. I wasn't understanding why my example generated 10 signal files and only one corresponding line in the output csv. I later found out this snippet in the windowed_RP function:
If you take a look at the signals generated for the problem, you find the following:
Due to the usage of float variables in the name generation, as stated in the snippet below:
You can clearly infer that all the data is being saved in a single dict key in the windowed_RP execution, and the resulting csv has only one group, therefore, only one line, as the image below shows (notice the presence of a single group):
I'm trying to understand better the SMdRQA technique and its application. Can someone explain me what is the expected csv output for the Rossler Attractor problem when the files are named correctly?