fsspec / adlfs

fsspec-compatible Azure Datalake and Azure Blob Storage access
BSD 3-Clause "New" or "Revised" License

Suggestions on how this can be used with Great Expectations? #295

Open sdebruyn opened 2 years ago

sdebruyn commented 2 years ago

Hi there 👋

Thank you for this great package! I've successfully implemented a data pipeline with Pandas that uses this package to read/write from/to Azure Data Lake Storage Gen2 using Managed Identity, just beautiful!

Now I'd like to use this as well to validate my data with Great Expectations. However, their documentation is not really fantastic and I can't figure out how I should configure it to use this package.

Has anyone else used this package in combination with Great Expectations? Any suggestions on how I should configure my data source?

TomAugspurger commented 2 years ago

Do you have a code snippet to share? Most likely, you'd use pandas' built-in fsspec / adlfs support to load the data, then write your assertion logic with Great Expectations. Once the data is in pandas, adlfs isn't used anymore.
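A minimal sketch of that split, under a few assumptions: the helper names (`adls_storage_options`, `load_csv`, `find_problems`) are hypothetical, and `find_problems` is a plain-pandas stand-in for the validation step rather than a Great Expectations call, since the Great Expectations entry point for in-memory DataFrames differs between releases. It also assumes that passing `anon=False` without an explicit key or token makes adlfs fall back to `DefaultAzureCredential`, which is what would pick up a Managed Identity.

```python
from typing import List, Optional

import pandas as pd


def adls_storage_options(account_name: str) -> dict:
    """Build fsspec storage options for adlfs (hypothetical helper).

    Assumption: with anon=False and no explicit credential, adlfs falls
    back to DefaultAzureCredential from azure-identity.
    """
    return {"account_name": account_name, "anon": False}


def load_csv(url: str, storage_options: Optional[dict] = None) -> pd.DataFrame:
    # pandas passes abfs:// URLs to fsspec, which dispatches to adlfs.
    # This is the only step where adlfs is involved.
    return pd.read_csv(url, storage_options=storage_options)


def find_problems(df: pd.DataFrame, required: List[str]) -> List[str]:
    """Stand-in for the validation step; it only ever sees the DataFrame."""
    problems = []
    for col in required:
        if col not in df.columns:
            problems.append(f"missing column: {col}")
        elif df[col].isna().any():
            problems.append(f"null values in column: {col}")
    return problems
```

Usage would look roughly like `df = load_csv("abfs://<container>/<path>.csv", adls_storage_options("<account>"))` followed by `find_problems(df, [...])`, with the placeholders filled in for your account. Whatever validation library you use would take `df` at that second step, so it needs no adlfs-specific configuration.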