RMI-PACTA / PACTA_analysis

Run the PACTA analysis on EQ & CB portfolios
Other
25 stars 70 forks source link

add more robust validation and fixing of portfolio data input #379

Open Clare2D opened 3 years ago

Clare2D commented 3 years ago

One report wasn't generating because of the following errors:

-- web_tool_script_1.R --------------------------------------------------------- There were 14 warnings (use warnings() to see them) Error: Problem with mutate() input value_usd. x non-numeric argument to binary operator i Input value_usd is if_else(...). Backtrace: x

  1. +-global::process_raw_portfolio(...)
  2. | -global::calculate_value_usd_with_fin_data(portfolio)
  3. | -%>%(...)
  4. | +-base::withVisible(eval(quote(_fseq(_lhs)), env, env))
  5. | -base::eval(quote(_fseq(_lhs)), env, env)
  6. | -base::eval(quote(_fseq(_lhs)), env, env)
  7. | -_fseq(_lhs)
  8. | -magrittr::freduce(value, _function_list)
  9. | +-base::withVisible(function_list[k])
    1. | -function_list[k]
    2. | +-dplyr::mutate(...)
    3. | -dplyr:::mutate.data.frame(...)
    4. | -dplyr:::mutate_cols(.data, ...)
    5. | +-base::withCallingHandlers(...)
    6. | -mask$eval_all_mutate(dots[[i]])
    7. +-dplyr::if_else(...)
    8. -base::.handleSimpleError(...)
    9. -dplyr:::h(simpleError(msg, call)) Execution halted

This seems to me like an issue in the number formatting of the portfolio file (Swiss). Something to test with the new input reader/audit functionality, hopefully meaning this error doesn't appear again. Just wanted to flag. I can provide the portfolio input file that created this.

FYI @cjyetman

cjyetman commented 3 years ago

pretty sure this happens when a portfolio has weird number formatting in either number_of_shares or unit_share_price

it’s triggered here https://github.com/2DegreesInvesting/PACTA_analysis/blob/e7f8ca4e182282425f2d4bc2a8617682c4ecc1c3/0_portfolio_input_check_functions.R#L542-L548

I'm wondering if maybe we should add in a forced coercion of the columns to numeric when the portfolio is read in here? It won't "fix" the problem (because much of the portfolio's data will be coerced to NA), but at least it won't trigger a difficult to debug error later on. I'm not 100% sure if that's a good idea. https://github.com/2DegreesInvesting/PACTA_analysis/blob/496f7936b5cec7c29ff505cd0e4948e50d15d159/0_web_functions.R#L194-L230

cjyetman commented 3 years ago

Alternatively, a fancier pre-processing of the portfolio CSVs needs to be done... either earlier on in our code, or ideally on the server when the portfolio is uploaded.

cjyetman commented 2 years ago

changed issue name and this issue will be used to track a general task of improving validation and fixing of portfolio data when it is first read in

some things that should likely be checked and dealt with: