AMP-SCZ / outcome_calculations

Repository includes the scripts to calculate outcomes (predictors) for the AMP-SCZ study.
Apache License 2.0

ValueError: invalid literal for int() with base 10 #10

Open tashrifbillah opened 1 year ago

tashrifbillah commented 1 year ago

Way to reproduce:

python outcome_calculations.py pronet run_outcome

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  clean_df['value'] = np.round(clean_df['value'].astype(fill_type),3)
outcome_calculations.py:252: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  clean_df['value'] = clean_df['value'].astype(fill_type)
not sure what is going on with chrpps_sum10
Iteration 2: ID: BI00157
Elapsed time: 10.76 second
File /data/predict1/data_from_nda/Pronet/PHOENIX/PROTECTED/PronetBI/raw/BI00157/surveys/BI00157.Pronet.json is present
subject is arm_1 meaning chr
subject is female
Traceback (most recent call last):
  File "outcome_calculations.py", line 369, in <module>
    age_4 = baseln_df['chrdemo_age_mos2'].fillna(-900).to_numpy(dtype=int)/12
  File "/data/pnl/soft/pnlpipe3/miniconda3/envs/pnlpipe3/lib/python3.6/site-packages/pandas/core/base.py", line 845, in to_numpy
    result = np.asarray(self._values, dtype=dtype)
  File "/data/pnl/soft/pnlpipe3/miniconda3/envs/pnlpipe3/lib/python3.6/site-packages/numpy/core/_asarray.py", line 83, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: invalid literal for int() with base 10: '228.156'

Hi @nora6591, please see if you can fix this error soon.
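The traceback shows that chrdemo_age_mos2 holds the string '228.156', which int() cannot parse directly. One possible fix, sketched below, is to parse the column as floats first and only then truncate to whole months. The sample DataFrame is a made-up stand-in for baseln_df, and this is an assumption about the intended behavior rather than the repository's actual patch.

```python
import pandas as pd

# Hypothetical stand-in for baseln_df: ages in months arrive as strings,
# one with a decimal part and one missing.
baseln_df = pd.DataFrame({"chrdemo_age_mos2": ["228.156", None, "240"]})

# Parse as float first; unparseable entries become NaN and then get the
# -900 missing-value code used on line 369 of the script.
months = pd.to_numeric(baseln_df["chrdemo_age_mos2"], errors="coerce").fillna(-900)

# Truncate to whole months, then convert to years. As in the original line,
# the -900 code is divided by 12 as well.
age_4 = months.to_numpy(dtype=float).astype(int) / 12

print(age_4)
```

If truncation to whole months is not actually wanted, dropping the astype(int) step would keep the fractional age instead.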

npenzel commented 1 year ago

Hi Tashrif,

When I ran the code, this error did not occur:

Iteration 2: ID: BI00157
Elapsed time: 14.73 second
File /data/predict1/data_from_nda/Pronet/PHOENIX/GENERAL/PronetBI/processed/BI00157/surveys/BI00157.Pronet.json is present
subject is arm_1 meaning chr
subject is female
age
[19.]
Married: currently or previously married
[-900]
Married: never married 
[-900]
not sure what is going on with chrpps_sum10
Iteration 3: ID: BI00230

Could it be that you ran it with a different Python version? I remember we once talked about how some type conversions might not work under another version.

Best, Nora

tashrifbillah commented 1 year ago

No, it's the same Python. Did you finish running it for the entire Pronet? If yes, that's good enough for now.

tashrifbillah commented 1 year ago

Okay so here is the difference in our input: your execution used

File /data/predict1/data_from_nda/Pronet/PHOENIX/GENERAL/PronetBI/processed/BI00157/surveys/BI00157.Pronet.json is present

But my execution used:

File /data/predict1/data_from_nda/Pronet/PHOENIX/PROTECTED/PronetBI/raw/BI00157/surveys/BI00157.Pronet.json is present

We talked about using the deidentified JSONs at PHOENIX/GENERAL/*/processed/*/surveys/*.Pronet.json. Why did your code at /data/predict1/home/np487/outcome_calculations/outcome_calculations.py, which I ran, use the raw JSONs?
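To make the choice of input explicit, the script could resolve survey files only through the deidentified pattern above. A minimal sketch, assuming a hypothetical helper find_survey_json (not a function in the repository) and that the PHOENIX root is passed in by the caller:

```python
from pathlib import Path


def find_survey_json(phoenix_root, subject_id, network="Pronet"):
    """Return the deidentified survey JSON for a subject, or None if absent.

    Only PHOENIX/GENERAL/*/processed/... is searched, so the raw files under
    PHOENIX/PROTECTED/*/raw/... can never be picked up by accident.
    """
    pattern = f"GENERAL/*/processed/{subject_id}/surveys/{subject_id}.{network}.json"
    matches = sorted(Path(phoenix_root).glob(pattern))
    return matches[0] if matches else None
```

Called with /data/predict1/data_from_nda/Pronet/PHOENIX and BI00157, this would be expected to return the GENERAL/PronetBI/processed path from the second log, provided that file exists.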


There were some changes in the date offset file. It will affect the date outcomes in psychs.csv. So now we need to re-run your program before exporting data to NDA. This re-run should have happened yesterday.