gslab-econ / gslab_r

5 stars 1 forks source link

PR for #14: Create binned scatter function #22

Closed ew487 closed 2 years ago

ew487 commented 2 years ago

Closes #14

Note: confidence intervals still in progress.

ew487 commented 2 years ago

Hi @jmshapir, I'm having trouble adding @torenfronsdal as a reviewer, do you know how to fix this?

jmshapir commented 2 years ago

Hi @jmshapir, I'm having trouble adding @torenfronsdal as a reviewer, do you know how to fix this?

I added @torenfronsdal to the repository and was able to add @torenfronsdal as a reviewer.

(For now @torenfronsdal has read access; we can elevate to write if/as needed.)

ew487 commented 2 years ago

I added @torenfronsdal to the repository and was able to add @torenfronsdal as a reviewer.

Great, thanks @jmshapir @torenfronsdal !

ew487 commented 2 years ago

Update: I just found a problem with linpartial_var & working on debugging.

ghost commented 2 years ago

@ew487 We may want to also add a test case where we implement linpartial_var for some sample data.

ew487 commented 2 years ago

Thanks @torenfronsdal, that's a good idea! I'll work on that.

ew487 commented 2 years ago

Update: I just found a problem with linpartial_var & working on debugging.

It turns out that the bug does not have to do with linpartial_var but actually occurs when some bins have no observations (for example, when I simulate data using low N and have high nBins). @jmshapir @torenfronsdal Could you please let me know what you prefer between adding code to handle this case, or raising an error & terminating?

ghost commented 2 years ago

Update: I just found a problem with linpartial_var & working on debugging.

It turns out that the bug does not have to do with linpartial_var but actually occurs when some bins have no observations (for example, when I simulate data using low N and have high nBins). @jmshapir @torenfronsdal Could you please let me know what you prefer between adding code to handle this case, or raising an error & terminating?

@ew487 @jmshapir Maybe we should decide on a minimum number of obs. per bin and throw an error if we have fewer.

This binscatter function (by my former coworker) requires at least 10 obs. per bin.

ew487 commented 2 years ago

@torenfronsdal Ok! That makes sense. I'll add something like this. Thanks for this info!

jmshapir commented 2 years ago

@ew487 @jmshapir Maybe we should decide on a minimum number of obs. per bin and throw an error if we have fewer.

This binscatter function (by my former coworker) requires at least 10 obs. per bin.

This sounds fine to me. If we go this route, I think we should set a default minimum but let the user override if they wish.

If the resulting regression model is underidentified I think we should just let R throw whatever is the default error (presumably controlled by the underlying regression command) and terminate.

ew487 commented 2 years ago

Update: I've added a test including linpartial_var in 1a427a7, and the minimum observations per bin is implemented in f83a098 (in the Scatter function that is renamed to BinScatter in 2522030). Also, the GetMidpoint, GetMeanval, and MakeBinIndicator functions are now standalone instead of locally defined functions in BinScatter.

ew487 commented 2 years ago

Summary here.