catniplab / NPGLM

non-parametric inference for convolutional filters in poisson glms
MIT License

Could you explain how to prepare data (*.pkl)? #1

Open ys7yoo opened 3 years ago

ys7yoo commented 3 years ago

Dear authors,

The intriguing paper (Non-parametric generalized linear model) from your group led me to this repo. I'm eager to learn from your paper and code.

However, I am having difficulty running the demo code (Demo/demo.py). It fails with the following error:

Traceback (most recent call last):
  File "/Users/yyoo/src/npglm/Demo/demo.py", line 30, in <module>
    main()
  File "/Users/yyoo/src/npglm/Demo/demo.py", line 19, in main
    gp.initialize_design_matrices()
  File "/Users/yyoo/src/npglm/GLM/GLM_Model/Model_Runner.py", line 54, in initialize_design_matrices
    self.data_df = pd.read_pickle(self.params.expt_problem_data_path)
  File "/Users/yyoo/opt/anaconda3/envs/torch/lib/python3.6/site-packages/pandas/io/pickle.py", line 170, in read_pickle
    f, fh = get_handle(fp_or_buf, "rb", compression=compression, is_text=False)
  File "/Users/yyoo/opt/anaconda3/envs/torch/lib/python3.6/site-packages/pandas/io/common.py", line 434, in get_handle
    f = open(path_or_buf, mode)
FileNotFoundError: [Errno 2] No such file or directory: 'PreProcessing/generated_data/filter_set_a/data_df.pkl'

Process finished with exit code 1

It seems that I should have prepared the data in a pickle file, but I'm not sure where to start.

Memo of [Compatible Data Format] in the README.md says I should prepare a pandas Datafro

Could you kindly provide a guideline (or example code) for preparing a dataset? A quick example to produce the toy simulations in the paper would be great. Also, if possible, a sample file used for the toy simulation in the paper would be greatly appreciated.

Thank you very much! Yongseok

matthew-dowling commented 3 years ago

Hi Yongseok,

Thanks so much for alerting me to this. I had overlooked uploading the toy data, but I have since updated the repo. In addition, synthetic_demo_one.py is an updated version of what I had intended the demo to be, so I have removed demo.py in favor of it.

For the data to be compatible, it should be a pandas DataFrame saved in the .pkl format. The index of the DataFrame should list the relevant covariates, and one of them, the one containing your spiking data, must be labeled 'History'. The 'data' column of the DataFrame should contain a (number of trials x trial length) numpy matrix.

Here's an example of the pandas DataFrame for the toy problem, which included 100 trials of 2500 time bins each (2.5 s):

[image: example DataFrame for the toy problem]
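For readers who cannot see the image, the format described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used to generate the toy data: the 'Stimulus' covariate, the random placeholder values, the helper names `build_toy_dataframe` and `check_format`, and the requirement that every covariate share the shape of 'History' are all hypothetical. Only the 'History' label, the 'data' column, and the (trials x trial length) shape come from the description above.

```python
import os
import tempfile

import numpy as np
import pandas as pd

N_TRIALS = 100   # trials in the toy problem
N_BINS = 2500    # time bins per trial (2.5 s)


def build_toy_dataframe(seed=0):
    """Build a DataFrame indexed by covariate name with a single 'data' column."""
    rng = np.random.default_rng(seed)
    # Placeholder spike trains; NOT the generative model from the paper.
    spikes = (rng.random((N_TRIALS, N_BINS)) < 0.02).astype(float)
    # Hypothetical extra covariate; the name 'Stimulus' is an assumption.
    stimulus = rng.standard_normal((N_TRIALS, N_BINS))

    # One object cell per covariate, each holding a full 2-D matrix.
    cells = np.empty(2, dtype=object)
    cells[0] = spikes
    cells[1] = stimulus
    return pd.DataFrame({"data": cells}, index=["History", "Stimulus"])


def check_format(df):
    """Return the (trials, bins) shape if df matches the described layout."""
    if "data" not in df.columns:
        raise ValueError("expected a 'data' column")
    if "History" not in df.index:
        raise ValueError("expected an index entry labeled 'History'")
    shapes = {label: np.shape(df.loc[label, "data"]) for label in df.index}
    history_shape = shapes["History"]
    if len(history_shape) != 2:
        raise ValueError("'History' data must be (trials x trial length)")
    # Assumption: all covariates are aligned with the spike trains.
    mismatched = [k for k, s in shapes.items() if s != history_shape]
    if mismatched:
        raise ValueError(f"shape mismatch for covariates: {mismatched}")
    return history_shape


if __name__ == "__main__":
    df = build_toy_dataframe()
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "data_df.pkl")
        df.to_pickle(path)                 # save in .pkl format
        reloaded = pd.read_pickle(path)    # same call the traceback shows
        print(check_format(reloaded))
```

In practice the pickle would be written to whatever path the run parameters point at (the traceback above shows the demo reading `expt_problem_data_path`), rather than to a temporary directory.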

I greatly appreciate you raising the issue and using the code, so please let me know if you have any further issues or questions! Some aspects may seem far from streamlined, and I hope to make the code much more intuitive and easier to use in the future.

Best, Matt