OxfordSchmidtAIFellows / turbo-parakeet

MIT License

Generalised data processing #12

Open hollypacey opened 1 year ago

hollypacey commented 1 year ago
  1. clean out bad recordings:
    • if for (stim = sucr; conc = 100) the bins are all 0
    • then for every sensillum tested with that stimulus combination (i.e., each matching BeeID / sensillum)
    • remove the data
    • save to a new 'cleaned' dataset for now so the original isn't lost.
  2. rebin a current dataset:
    • determine the current binning
    • what is the new binning wanted?
    • check whether it is doable, i.e. the new bin width is a whole multiple of the previous bin width
    • rebin and save to a new 'rebinned' dataset so the original isn't lost.
  3. try to determine more of the dataset characteristics automatically and set metadata/timeseries
    • can we also record the time duration and bin sizes?
    • use time values rather than time indices (the indices may be redundant, as they are already accessible as list indices?)
    • extend the data read-in to handle more complex cases.
    • add class functions to access and print properties of the dataset.
  4. filtering
    • functions to inform you about the metadata, e.g. what types of sugar are available in the dataset
    • functions to filter into a new dataset based on specific characteristics, e.g. keep all the data from fructose and put it into a new dataset. [and then you could have a few different subsets of the data to compare in plots or input to clustering algorithms.]
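
A minimal sketch of how steps 1, 2 and 4 could look, not the repository's actual code. It assumes each recording is a plain dict with the hypothetical keys `BeeID`, `sensillum`, `stim`, `conc` and `bins` (a list of spike counts), that bin widths are integers in some common unit, and that "remove the data" means dropping every recording from a BeeID / sensillum pair whose (sucr, 100) recording is all zeros. The function names and the `fruc` label are made up for illustration. Every function returns a new list, so the original dataset is never modified.

```python
from collections import Counter


def clean_bad_recordings(records, stim="sucr", conc=100):
    """Drop all recordings from sensilla whose (stim, conc) bins are all 0."""
    # Assumption: a sensillum is identified by its (BeeID, sensillum) pair.
    bad = {
        (r["BeeID"], r["sensillum"])
        for r in records
        if r["stim"] == stim and r["conc"] == conc and not any(r["bins"])
    }
    return [r for r in records if (r["BeeID"], r["sensillum"]) not in bad]


def rebin(records, old_width, new_width):
    """Sum neighbouring bins so that each new bin spans new_width."""
    if new_width % old_width != 0:
        raise ValueError("new bin width must be a whole multiple of the old one")
    factor = new_width // old_width
    out = []
    for r in records:
        bins = r["bins"]
        if len(bins) % factor != 0:
            raise ValueError("number of bins is not divisible by the rebin factor")
        # Assumption: bins hold counts, so merging bins means adding them.
        merged = [sum(bins[i:i + factor]) for i in range(0, len(bins), factor)]
        out.append({**r, "bins": merged})
    return out


def available(records, key):
    """Count how many recordings exist for each value of a metadata key."""
    return Counter(r[key] for r in records)


def filter_records(records, **criteria):
    """Keep only the recordings matching every key=value criterion."""
    return [r for r in records if all(r[k] == v for k, v in criteria.items())]


# Toy data: sensillum 1 has an all-zero sucrose recording, sensillum 2 does not.
records = [
    {"BeeID": 1, "sensillum": 1, "stim": "sucr", "conc": 100, "bins": [0, 0, 0, 0]},
    {"BeeID": 1, "sensillum": 1, "stim": "fruc", "conc": 10, "bins": [1, 2, 0, 1]},
    {"BeeID": 1, "sensillum": 2, "stim": "sucr", "conc": 100, "bins": [3, 1, 2, 2]},
    {"BeeID": 1, "sensillum": 2, "stim": "fruc", "conc": 10, "bins": [0, 1, 1, 0]},
]

cleaned = clean_bad_recordings(records)           # only sensillum 2 remains
rebinned = rebin(cleaned, old_width=100, new_width=200)
fructose = filter_records(rebinned, stim="fruc")  # subset for plots/clustering
print(available(cleaned, "stim"))
```

If the real dataset lives in a table-like structure rather than a list of dicts, the same three checks (all-zero control, whole-multiple bin width, key=value filtering) should carry over, but the details would differ.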