Closed yasirroni closed 1 year ago
Maybe an option --remove-non-cells?
If you are interested, I might be able to help. Thanks.
This is the target notebook:
https://github.com/yasirroni/nb-clean/blob/remove_non_cells/tests/notebooks/clean_only_cells.ipynb
The bare minimum I found for a notebook to still render is:
{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "print(\"Hello, world\")"
      ]
    }
  ],
  "metadata": {},
  "nbformat": 4,
  "nbformat_minor": 2
}
metadata, nbformat and nbformat_minor must remain, as they're required fields as per the schema.
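As a rough illustration of the point about required fields, the sketch below checks a notebook dict for the four top-level keys. The helper name `missing_required_fields` and the `REQUIRED_TOP_LEVEL` tuple are made up for this example; the tuple reflects my reading of the version 4 schema and is not taken from nb-clean.

```python
import json

# Assumption: top-level keys required by the nbformat v4 schema.
REQUIRED_TOP_LEVEL = ("cells", "metadata", "nbformat", "nbformat_minor")


def missing_required_fields(notebook: dict) -> list:
    """Return the required top-level keys that are absent from the notebook."""
    return [key for key in REQUIRED_TOP_LEVEL if key not in notebook]


# The minimal notebook from this thread, parsed from JSON.
minimal = json.loads("""
{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": ["print(\\"Hello, world\\")"]
    }
  ],
  "metadata": {},
  "nbformat": 4,
  "nbformat_minor": 2
}
""")

# Dropping "metadata" entirely (rather than emptying it) leaves a gap.
without_metadata = {k: v for k, v in minimal.items() if k != "metadata"}

print(missing_required_fields(minimal))
print(missing_required_fields(without_metadata))
```

This is why emptying metadata to {} is a safer target than deleting the key.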
I'd prefer we expressed this as --remove-{thing to remove} rather than --remove-non-{thing to keep}, as it composes better with existing options and is more robust to future changes to the notebook schema that add additional fields.
Then we can keep it simple by cleaning the contents of metadata. As for nbformat and nbformat_minor, I don't know the best default value for every case, so maybe we shouldn't touch them for now.
The conclusion, then, is to support --remove-notebook-metadata, which sets metadata to {}.
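A minimal sketch of what that option could do, working on the plain JSON dict. The function name `remove_notebook_metadata` is hypothetical and this is not the code from nb-clean; it only empties the top-level metadata and leaves cells, nbformat and nbformat_minor as they are.

```python
import copy


def remove_notebook_metadata(notebook: dict) -> dict:
    """Return a copy of the notebook with top-level metadata set to {}.

    Hypothetical helper for illustration only. Cell-level metadata,
    nbformat and nbformat_minor are left untouched.
    """
    cleaned = copy.deepcopy(notebook)
    cleaned["metadata"] = {}
    return cleaned


notebook = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ['print("Hello, world")'],
        }
    ],
    "metadata": {
        "kernelspec": {"display_name": "Python 3", "name": "python3"},
        "language_info": {"name": "python"},
    },
    "nbformat": 4,
    "nbformat_minor": 2,
}

cleaned = remove_notebook_metadata(notebook)
print(cleaned["metadata"])
```

Working on a deep copy keeps the input notebook unchanged, which makes it easy to compare before and after.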
nbformat and nbformat_minor shouldn't be mutated. When we read the notebook with nbformat.read, we pass as_version=nbformat.NO_CONVERT to prevent version conversion.
Agreed.
So, would you support me adding an option to clean the metadata? It doesn't seem to be supported yet.
--remove-notebook-metadata would set the metadata value to {}.
This is also an issue for me. It seems that none of the contents of metadata are mandatory per the schema.
E.g. /metadata/kernelspec/display_name can easily change even on the same system, so it's very disruptive to have such data under version control.
I use VS Code and it also dumps
"metadata": {
  ...
  "vscode": {
    "interpreter": {
      "hash": "<some_hash_string>"
    }
  }
}
Also a good candidate for filtering.
Alternatively (or additionally), nb-clean could perhaps be even more flexible and support e.g. YAML paths, so that one could specify e.g. nb-clean --remove-path=/metadata/vscode or something like that.
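To make the path idea concrete, here is a small sketch of removing one entry by a slash-separated path. The function `remove_path` and the path syntax are assumptions for illustration; no such option exists in nb-clean as far as this thread shows. It only walks nested dicts, not list indices.

```python
import copy


def remove_path(notebook: dict, path: str) -> dict:
    """Return a copy of the notebook with the entry at `path` removed.

    Hypothetical helper: `path` is slash-separated, e.g. "/metadata/vscode".
    A path that does not exist is ignored.
    """
    keys = [part for part in path.split("/") if part]
    cleaned = copy.deepcopy(notebook)
    if not keys:
        return cleaned

    node = cleaned
    for key in keys[:-1]:
        node = node.get(key)
        if not isinstance(node, dict):
            # An intermediate key is missing or not a mapping: nothing to do.
            return cleaned

    node.pop(keys[-1], None)
    return cleaned


notebook = {
    "cells": [],
    "metadata": {
        "kernelspec": {"display_name": "Python 3", "name": "python3"},
        "vscode": {"interpreter": {"hash": "<some_hash_string>"}},
    },
    "nbformat": 4,
    "nbformat_minor": 2,
}

without_vscode = remove_path(notebook, "/metadata/vscode")
without_display_name = remove_path(notebook, "/metadata/kernelspec/display_name")
print(sorted(without_vscode["metadata"]))
print(without_display_name["metadata"]["kernelspec"])
```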
Hi @haplav, I implemented --remove-notebook-metadata in https://github.com/srstevenson/nb-clean/pull/169.
But after using nbQA, I think the best approach is to add:
"metadata": {
  "language_info": {
    "name": "python"
  }
}
Do you have any suggestions to improve that PR? Thank you.
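One possible reading of that suggestion, sketched on the plain JSON dict: drop everything in metadata except the language name. The helper `minimal_metadata` is hypothetical and is not what the PR does; it is only meant to show the shape of the result.

```python
def minimal_metadata(notebook: dict) -> dict:
    """Return metadata reduced to language_info.name, or {} if it is absent.

    Hypothetical helper for illustration only.
    """
    name = notebook.get("metadata", {}).get("language_info", {}).get("name")
    if name is None:
        return {}
    return {"language_info": {"name": name}}


notebook = {
    "cells": [],
    "metadata": {
        "kernelspec": {"display_name": "Python 3", "name": "python3"},
        "language_info": {"name": "python", "version": "3.11.4"},
        "vscode": {"interpreter": {"hash": "<some_hash_string>"}},
    },
    "nbformat": 4,
    "nbformat_minor": 2,
}

reduced = minimal_metadata(notebook)
print(reduced)
```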
This issue was closed due to inactivity. Please reopen if still relevant.
What do you think about preserving only cells? That is, it would clean everything except cells. In the notebook example, it would destroy: