Feature extractor run time error

mikailweston commented 3 years ago

When extracting features, I occasionally get this error. I am not sure which files it refers to:

Extracting features for animal 1238529_39_40
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
:1: RuntimeWarning: divide by zero encountered in log
C:\Users\mweston\Documents\GitHub\pyecog2\pyecog2\feature_extractor.py:16: RuntimeWarning: divide by zero encountered in log
  return np.log(np.mean(np.abs(fdata[int(len(fdata)*band[0]/fs):int(len(fdata)*band[1]/fs)]))) # todo consider making this with proper units

mikailweston commented 3 years ago

Second feature extractor error:

Rat IVC picks up signals from all sorts of transmitters or incorrectly interprets the TID. In this rig I have created a project file with an animal with TIDs [137, 138], which started recording 1 week after another batch of animals in the same project and rig.

When NDF conversion happened, incorrect data was interpreted by LWDAQ as TID 138 only for the first 184 files, which happened to match the TIDs of an actual transmitter with TIDs [137, 138] in another 1500 files.

When I attempted feature extraction, multiprocessing failed because it was looking for TID 137, which did not exist in the first files of the folder. Is there a way of alerting the user when too few TIDs are detected and omitting these files from NDF conversion, or of handling the feature extractor exception instead?

Extracting features for animal 1242137_137_138
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
Building File Tree...
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Users\mweston\Anaconda3\envs\pyecog2\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\mweston\Documents\GitHub\pyecog2\pyecog2\feature_extractor.py", line 154, in extract_features_from_file
    self.extract_features_from_time_range(file_buffer, time_range, feature_fname, feature_metafname,animal_id)
  File "C:\Users\mweston\Documents\GitHub\pyecog2\pyecog2\feature_extractor.py", line 177, in extract_features_from_time_range
    features[i,j] = func(data)
  File "<string>", line 1, in <lambda>
IndexError: index 1 is out of bounds for axis 1 with size 1
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\mweston\Documents\GitHub\pyecog2\pyecog2\coding_tests\WaveletWidget.py", line 147, in run
    result = self.fn(*self.args, **self.kwargs)
  File "C:\Users\mweston\Documents\GitHub\pyecog2\pyecog2\coding_tests\FeatureExtractorGUI.py", line 230, in extractFeatures
    self.feature_extractor.extract_features_from_animal(animal, re_write = self.re_write.isChecked(), n_cores = -1,
  File "C:\Users\mweston\Documents\GitHub\pyecog2\pyecog2\feature_extractor.py", line 134, in extract_features_from_animal
    for i, _ in enumerate(pool.imap(self.extract_features_from_file, tuples)):
  File "C:\Users\mweston\Anaconda3\envs\pyecog2\lib\multiprocessing\pool.py", line 868, in next
    raise value
IndexError: index 1 is out of bounds for axis 1 with size 1
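
For readers hitting the same traceback: the message suggests that a feature lambda indexed column 1 of an array that had only one column. The following is a minimal sketch of how such a case could be guarded, not pyecog2's actual code. The names `extract_features`, `feature_funcs`, and `n_channels_expected` are hypothetical, and the data layout (samples in rows, channels in columns) is an assumption.

```python
import numpy as np

# Hypothetical per-channel features; the second one indexes column 1.
feature_funcs = [
    lambda d: np.std(d[:, 0]),
    lambda d: np.std(d[:, 1]),
]

def extract_features(data, feature_funcs, n_channels_expected):
    """Return a 1-D feature vector, or None if the file has too few channels.

    Assumes `data` is a 2-D array shaped (samples, channels).
    """
    if data.ndim != 2 or data.shape[1] < n_channels_expected:
        # Skip the file instead of letting a feature function raise IndexError.
        return None
    return np.array([func(data) for func in feature_funcs])

one_channel = np.zeros((100, 1))   # e.g. a file where only one TID was found
two_channels = np.ones((100, 2))

skipped = extract_features(one_channel, feature_funcs, n_channels_expected=2)
features = extract_features(two_channels, feature_funcs, n_channels_expected=2)
```

Returning `None` lets the caller log and skip the file, so one bad file does not abort the whole run.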

mfpleite commented 3 years ago

`np.mean(np.abs(fdata[int(len(fdata)*band[0]/fs):int(len(fdata)*band[1]/fs)]))`

These look like warnings from trying to compute log(0). Are all the feature files created properly? The signal is probably flatlined, but I thought I was introducing one bit of noise to avoid this sort of thing; I'll double check. In any case, if I remember correctly, the NaNs in the features should be handled gracefully at a later stage.
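
As an illustration of the warning, here is a small sketch that reuses the expression from the traceback and adds an optional lower bound. The function name `log_band_amplitude` and the `floor` parameter are hypothetical and not part of pyecog2. It assumes `fdata` is the FFT of the signal, `band` is a `(low_hz, high_hz)` pair, and `fs` is the sampling rate in Hz.

```python
import numpy as np

def log_band_amplitude(fdata, band, fs, floor=None):
    """Log of the mean spectral amplitude in `band`, as in the warning's source line.

    If `floor` is given (hypothetical option), the mean amplitude is clipped
    from below so that a flat-lined signal yields a finite value.
    """
    lo = int(len(fdata) * band[0] / fs)
    hi = int(len(fdata) * band[1] / fs)
    mean_amp = np.mean(np.abs(fdata[lo:hi]))
    if floor is not None:
        mean_amp = np.maximum(mean_amp, floor)
    return np.log(mean_amp)

fs = 256.0
flat = np.fft.fft(np.zeros(512))  # spectrum of an all-zero (flat-lined) signal

with np.errstate(divide="ignore"):  # silence the RuntimeWarning for the demo
    unguarded = log_band_amplitude(flat, (1, 4), fs)

guarded = log_band_amplitude(flat, (1, 4), fs, floor=1e-12)
```

Without the floor, the all-zero spectrum gives `-inf` along with the divide-by-zero warning. With the floor, the result stays finite.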

mfpleite commented 3 years ago

When NDF conversion happened, incorrect data was interpreted by LWDAQ as TID 138 only for the first 184 files, which happened to match the TIDs of an actual transmitter with TIDs [137, 138] in another 1500 files.

Yeah, this is bound to happen with the OSI transmitters. I think there is already a safeguard that only accepts files with a minimum number of samples; I have to double check that forcing the sampling rate and TIDs does not override this safeguard. Anyway, a quick fix for you is to manually delete those early files, or to do the NDF conversion with the precise start and end dates of the experiment for the animals you are converting.
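
The date-window workaround could be approximated outside the GUI with a sketch like the one below. It assumes, and this is only an assumption, that the NDF files are named `M<unix seconds>.ndf`. The helper `files_in_window` is hypothetical and not part of pyecog2.

```python
import re
from datetime import datetime, timezone

# Assumed naming scheme: "M" followed by a Unix timestamp in seconds.
NDF_NAME = re.compile(r"^M(\d+)\.ndf$")

def files_in_window(filenames, start, end):
    """Return the NDF file names whose timestamp lies within [start, end]."""
    lo, hi = start.timestamp(), end.timestamp()
    keep = []
    for name in filenames:
        match = NDF_NAME.match(name)
        if match and lo <= int(match.group(1)) <= hi:
            keep.append(name)
    return keep

names = ["M1623000000.ndf", "M1624000000.ndf", "notes.txt"]
start = datetime(2021, 6, 15, tzinfo=timezone.utc)
end = datetime(2021, 6, 30, tzinfo=timezone.utc)
selected = files_in_window(names, start, end)
```

Files outside the window, or with names that do not match the assumed pattern, are left out of the selection.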

mfpleite commented 3 years ago

I have to double check that forcing the sampling rate and TIDs does not override this safeguard

OK, so I double checked, and the minimum number of samples to consider a TID present in a file is 20K, which is already fairly high. I am reluctant to increase it, because otherwise very short files from acute experiments would have to go through a lot more hassle to convert.

mikailweston commented 3 years ago

Ok, I think I will just have to delete the unwanted files.

The channel 138 data in the early period is all unwanted junk, but the project file is set up for all the animals in the whole group which start at different times.

I think overall it will save me more time to delete unwanted single channel junk data after choosing a wide time window to convert, as I want only one project file per rig.

Would it be possible to add an option in the code to only convert the NDF file if both selected TIDs are present?
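
To make the request concrete, here is a hedged sketch of what such an option might look like. It is not the pyecog2 converter's API: `tids_present`, `should_convert`, and the `require_all` flag are hypothetical, and `sample_counts` is assumed to map each TID to the number of samples found in one file. The 20K default is the threshold mentioned earlier in the thread.

```python
def tids_present(sample_counts, min_samples=20_000):
    """TIDs whose sample count in the file reaches the minimum threshold."""
    return {tid for tid, n in sample_counts.items() if n >= min_samples}

def should_convert(sample_counts, selected_tids, min_samples=20_000, require_all=False):
    """Decide whether a file is worth converting.

    require_all=False: convert if any selected TID is present.
    require_all=True:  convert only if every selected TID is present.
    """
    present = tids_present(sample_counts, min_samples)
    selected = set(selected_tids)
    if require_all:
        return selected <= present
    return bool(selected & present)

early_file = {138: 50_000}               # only the spurious TID 138
later_file = {137: 60_000, 138: 60_000}  # both transmitters recorded

lenient = should_convert(early_file, [137, 138])
strict_early = should_convert(early_file, [137, 138], require_all=True)
strict_later = should_convert(later_file, [137, 138], require_all=True)
```

Keeping `require_all` off by default would leave existing scripts that convert transmitters for several animals unaffected.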


mfpleite commented 3 years ago

Sorry - didn't realize this was unanswered:

Would it be possible to add an option in the code to only convert the NDF file if both selected TIDs are present?

I don't think this is a good approach because the NDF converter code might be used in old scripts with transmitters for multiple animals, and there is a simple workaround here.

KullmannLab / pyecog2

Feature extractor run time error #27