Closed luca-s closed 6 years ago
@twiecki max_loss is configured to allow max 5% of data loss by default: is that too strict?
I don't believe there is anything controversial in this commit; it's a pretty straightforward change. I'll merge this and we can decide later what the best default value for max_loss is. For now it is 5%.
@twiecki I fixed the issue of max_loss being too strict in this PR #210
Dropping factor data without warning the user can lead to misinterpreted results. For this reason, utils.get_clean_factor_and_forward_returns now has a new parameter, 'max_loss', that controls the maximum percentage of factor data that can be dropped. Data can be dropped because it is flawed itself (e.g. NaNs), because not enough price data was provided to compute forward returns for all factor values, or because of binning errors.
Also, small errors in the binning phase (utils.quantize_factor) caused by sporadic flawed data no longer raise exceptions if the incurred data loss is less than 'max_loss'.
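To illustrate the idea behind 'max_loss', here is a minimal standalone sketch of a data-loss check. It is not the code from this PR: the names `drop_flawed` and `DataLossError` are hypothetical, plain lists stand in for the factor data, and only NaN values are treated as flawed.

```python
import math


class DataLossError(Exception):
    """Hypothetical error: too much factor data was dropped."""


def drop_flawed(values, max_loss=0.05):
    """Drop NaN entries and return (clean_values, loss_fraction).

    Raises DataLossError if the dropped fraction exceeds max_loss.
    The 5% default mirrors the value discussed in this thread.
    """
    if not values:
        return [], 0.0
    clean = [v for v in values if not math.isnan(v)]
    # Fraction of the original data that was dropped
    loss = (len(values) - len(clean)) / len(values)
    if loss > max_loss:
        raise DataLossError(
            "max_loss (%.1f%%) exceeded: %.1f%% of data dropped"
            % (max_loss * 100, loss * 100)
        )
    return clean, loss


nan = float("nan")

# 3% of the values are NaN: dropped silently, within the 5% limit
clean, loss = drop_flawed([1.0] * 97 + [nan] * 3)

# 10% of the values are NaN: the caller is told instead of losing data quietly
try:
    drop_flawed([1.0] * 90 + [nan] * 10)
    raised = False
except DataLossError:
    raised = True
```

With this shape, a caller who knows their data is sparse can pass a larger 'max_loss' explicitly, while the default keeps unexpected losses from going unnoticed.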