Malachov / mydatapreprocessing

Load data (JSON, CSV, Parquet) from a web link or local file, consolidate it, and preprocess it (resampling, replacing NaNs, standardizing, etc.) based on configuration.
MIT License

pd.read_json #1

Closed Servando1990 closed 3 years ago

Servando1990 commented 3 years ago

Hi, have you considered pd.read_json() instead of json.load()?

Amazing work btw

Malachov commented 3 years ago

Hi Servando,

thank you very much for the tip...

The json library is used because the pandas version has several variants of orient ('split', 'columns', 'records', 'index'), whereas json automatically creates a list or dict based on the JSON structure, so I can apply the same processing as for any other dict or list and just define whether the data are stored in columns or in rows. json is also a Python built-in library, so it adds no extra requirements.
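A minimal sketch of that idea, using only the standard library. The function name `load_json_columns` and the `orientation` parameter are hypothetical and are not taken from mydatapreprocessing; the sketch also assumes that every top-level value is itself a list of equal length.

```python
import json


def load_json_columns(text, orientation="columns"):
    """Parse JSON text and return a dict mapping column label -> list of values.

    Hypothetical helper for illustration only; not the library's actual code.
    Assumes each top-level value is a list and all lists have equal length.
    """
    data = json.loads(text)  # becomes a dict or a list, depending on the JSON

    # Reduce both cases to labels plus a list of series, so that the rest of
    # the processing is identical for dict and list input.
    if isinstance(data, dict):
        labels, series = list(data.keys()), list(data.values())
    elif isinstance(data, list):
        labels, series = list(range(len(data))), data
    else:
        raise TypeError("Expected a JSON object or array at the top level")

    if orientation == "columns":
        # Each series already is one column.
        return dict(zip(labels, series))
    if orientation == "rows":
        # Each series is one row: transpose, labelling columns by position.
        return {i: list(column) for i, column in enumerate(zip(*series))}
    raise ValueError("orientation must be 'columns' or 'rows'")


columns = load_json_columns('{"a": [1, 2], "b": [3, 4]}')
from_rows = load_json_columns("[[1, 3], [2, 4]]", orientation="rows")
```

The only format-specific decision left to the caller is whether the parsed values are columns or rows.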

But of course, with just a couple of if statements (the same as in the dict / list processing), the pandas version could be used as well...
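For comparison, a rough sketch of what the pandas variant might look like. The wrapper `load_json_with_pandas` and its `orientation` parameter are hypothetical; only `pd.read_json` and its `orient` argument are real pandas API. The input is assumed to be a JSON object of objects, which is the shape pandas documents for the 'columns' and 'index' orients.

```python
from io import StringIO

import pandas as pd


def load_json_with_pandas(text, orientation="columns"):
    """Read JSON text into a DataFrame via pd.read_json.

    Hypothetical wrapper for illustration only; not the library's actual code.
    """
    # The "couple of if lines": map the columns / rows choice to an orient.
    if orientation == "columns":
        orient = "columns"  # {column -> {index -> value}}
    elif orientation == "rows":
        orient = "index"  # {index -> {column -> value}}
    else:
        raise ValueError("orientation must be 'columns' or 'rows'")

    # StringIO is used because recent pandas versions deprecate passing
    # a literal JSON string directly to read_json.
    return pd.read_json(StringIO(text), orient=orient)


by_columns = load_json_with_pandas('{"a": {"0": 1, "1": 2}, "b": {"0": 3, "1": 4}}')
by_rows = load_json_with_pandas(
    '{"r1": {"a": 1, "b": 3}, "r2": {"a": 2, "b": 4}}', orientation="rows"
)
```

The trade-off is the one described above: pandas does the parsing and the DataFrame construction in one call, but the caller has to pick the matching orient for each JSON layout.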

Maybe the result would be more efficient, but since I use it as preprocessing for machine learning models, which are much more computationally demanding, I have concentrated on new features rather than on performance.

I hope this small library helped you... If you want to participate, you are welcome : ] (for example, if you think that pd.read_json is a good idea, feel free to anything and just ),

Daniel