Addressing a Problem?
The current catalog tools load the whole CSV into memory and sometimes update the file. If another process edits the CSV in the meantime, the in-memory copy does not see the change, which can corrupt the catalog or silently drop lines.
Potential Solution
Could we make the ProjectCatalog self-aware with something like watchdog? The object would automatically reload whenever the CSV changes on disk. Conversely, it should automatically write the CSV each time the DataFrame is updated.
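
To make the idea concrete, here is a minimal sketch of the two-way sync, not a proposed implementation. To stay dependency-free it makes several assumptions: `SyncedCatalog`, `refresh` and `add_row` are hypothetical names, a plain list of dicts stands in for the DataFrame, and a check of the file's modification time and size stands in for a watchdog event handler (with watchdog, `refresh` would presumably be triggered from the handler instead of being called before each write).

```python
import csv
import os
import tempfile


class SyncedCatalog:
    """Hypothetical stand-in for ProjectCatalog: rows kept in sync with a CSV."""

    def __init__(self, path):
        self.path = path
        self._stamp = None  # (mtime_ns, size) of the file when last read/written
        self.rows = []
        self.fieldnames = []
        self.refresh()

    def _file_stamp(self):
        st = os.stat(self.path)
        return (st.st_mtime_ns, st.st_size)

    def refresh(self):
        """Reload from disk if the file changed since we last touched it."""
        # Take the stamp before reading: if the file changes mid-read,
        # the next refresh() sees a different stamp and reloads again.
        stamp = self._file_stamp()
        if stamp == self._stamp:
            return False
        with open(self.path, newline="") as f:
            reader = csv.DictReader(f)
            self.rows = list(reader)
            self.fieldnames = list(reader.fieldnames or [])
        self._stamp = stamp
        return True

    def add_row(self, row):
        """Pick up external edits first, then append and write back."""
        self.refresh()
        self.rows.append(row)
        self._write()

    def _write(self):
        # Write to a temporary file in the same directory, then swap it in
        # with os.replace so readers never see a half-written catalog.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=self.fieldnames)
            writer.writeheader()
            writer.writerows(self.rows)
        os.replace(tmp, self.path)
        self._stamp = self._file_stamp()
```

This only narrows the window for lost lines: another process can still write between `refresh()` and `os.replace`, so a real fix would also need a file lock or, as noted under Additional context, an actual database.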
Additional context
(We are converging towards a real database here, Christian was right all this time. Damn scientists acting like programmers.)
Contribution
[ ] I would be willing/able to open a Pull Request to contribute this feature.