lanl / dsi

LANL Data Science Infrastructure Project
https://lanl.github.io/dsi

Add FileConsumer superclass with Bueno and CSV #62

Closed: DanielRJohnson closed this 1 year ago

DanielRJohnson commented 1 year ago

Creates a FileConsumer superclass that tracks each file's absolute path and hash. Also adds a basic CSV plugin and moves the Bueno plugin under FileConsumer.
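For readers skimming the thread, here is a minimal sketch of the idea, not the code in this PR. It assumes a constructor that takes one or more filenames, a `file_info` dict keyed by absolute path, and SHA-256 as the hash; all three are illustrative choices, and the real class may differ.

```python
import hashlib
import os


class FileConsumer:
    """Hypothetical base class: records each file's absolute path and hash."""

    def __init__(self, filenames):
        # Accept a single path or a list of paths (assumed interface).
        if isinstance(filenames, str):
            filenames = [filenames]
        self.filenames = list(filenames)

        # Maps absolute path -> hex digest of the file's bytes.
        # SHA-256 is an assumption; the PR does not name the algorithm here.
        self.file_info = {}
        for name in self.filenames:
            abs_path = os.path.abspath(name)
            with open(abs_path, "rb") as fh:
                self.file_info[abs_path] = hashlib.sha256(fh.read()).hexdigest()


class Csv(FileConsumer):
    """Hypothetical subclass: inherits path and hash tracking from the base."""
```

Putting the bookkeeping in the superclass means file-reading plugins such as Bueno and CSV get the same provenance record without duplicating it.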

Still some things to think about:

qwofford commented 1 year ago

resolves #59

qwofford commented 1 year ago

What do we still have to do to get this PR into main?

DanielRJohnson commented 1 year ago

I believe the CSV plugin still leaves Terminal.active_metadata malformed when it adds more rows than the other plugins; the shorter columns need padding with Nones or a forward-fill. Will look into that now.

DanielRJohnson commented 1 year ago

Added None padding in Terminal.transload; other strategies can be added later if helpful. See the changes in transload and the new test in test_file_consumer.
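A rough sketch of the padding idea, assuming the metadata is a dict of column name to list of values. The `pad_columns` helper and its `strategy` parameter are hypothetical and are not the code in Terminal.transload.

```python
def pad_columns(columns, strategy="none"):
    """Return a copy of `columns` with every list padded to the longest length.

    strategy="none"  pads short columns with None.
    strategy="ffill" repeats the last seen value (forward-fill) instead.
    """
    max_len = max((len(values) for values in columns.values()), default=0)
    padded = {}
    for name, values in columns.items():
        values = list(values)
        missing = max_len - len(values)
        # An empty column has nothing to forward-fill, so it falls back to None.
        fill = values[-1] if strategy == "ffill" and values else None
        padded[name] = values + [fill] * missing
    return padded


# One plugin contributed a single row, the CSV plugin contributed three.
metadata = {"run_id": [7], "x": [0.1, 0.2, 0.3]}
print(pad_columns(metadata))
print(pad_columns(metadata, strategy="ffill"))
```

Padding with None keeps every column the same length without inventing values, while forward-fill suits fields that describe the whole run rather than a single row.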