datamol-io / datamol

Molecular Processing Made Easy.
https://docs.datamol.io
Apache License 2.0
452 stars 47 forks source link

Load/save dataframe #163 #182

Closed dessygil closed 1 year ago

dessygil commented 1 year ago

Taking the code provided and added to test cases to load and save data frames. The code was placed into the datamol/utils file. This can be referenced at #163

Checklist:


dessygil commented 1 year ago

It is almost good to go! but I just have one questions

When I run the test for the parquet file, I am returned with a missing dependency error:

E           A suitable version of pyarrow or fastparquet is required for parquet support.
E           Trying to import the above resulted in these errors:
E            - Missing optional dependency 'pyarrow'. pyarrow is required for parquet support. Use pip or conda to install pyarrow.
E            - Missing optional dependency 'fastparquet'. fastparquet is required for parquet support. Use pip or conda to install fastparquet.

Everything else works fine but I'm not sure what to do here unless I can add the dependency

hadim commented 1 year ago

It is almost good to go! but I just have one questions

When I run the test for the parquet file, I am returned with a missing dependency error:

E           A suitable version of pyarrow or fastparquet is required for parquet support.
E           Trying to import the above resulted in these errors:
E            - Missing optional dependency 'pyarrow'. pyarrow is required for parquet support. Use pip or conda to install pyarrow.
E            - Missing optional dependency 'fastparquet'. fastparquet is required for parquet support. Use pip or conda to install fastparquet.

Everything else works fine but I'm not sure what to do here unless I can add the dependency

You can add pyarrow in the # Optional deps section in the env.yml file. It should be optional within the source code and not fail if a user does not have such dep.

hadim commented 1 year ago

Don't forget to add a NEWS file as well please.

codecov[bot] commented 1 year ago

Codecov Report

Merging #182 (afda6d7) into main (c53455b) will increase coverage by 0.11%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #182      +/-   ##
==========================================
+ Coverage   91.57%   91.69%   +0.11%     
==========================================
  Files          46       46              
  Lines        3622     3661      +39     
==========================================
+ Hits         3317     3357      +40     
+ Misses        305      304       -1     
Flag Coverage Δ
unittests 91.69% <100.00%> (+0.11%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
datamol/io.py 94.35% <100.00%> (+1.41%) :arrow_up:

... and 1 file with indirect coverage changes

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

hadim commented 1 year ago

@dessygil I just merged a PR in main. Can you rebase or merge main on this PR? You'll have to modify the imports in datamol/__init__.py and follow the new logic (I can help you once you fixed the merge conflict).

hadim commented 1 year ago

Thank you, @dessygil, for your contribution. I went ahead and pushed the latest fixes myself. I will merge once green.