wp-cli / export-command

Exports WordPress content to a WXR file.
MIT License

Add a flag to omit repetitive metadata from subsequent export files #76

Closed jmdodd closed 1 year ago

jmdodd commented 3 years ago

Feature Request

Describe your use case and the problem you are facing

When output is limited by max_file_size, sites with a large amount of "before_posts" data produce a series of WXR files that could in theory consist entirely of metadata and never reach the point of exporting posts. In the more egregious cases I've seen, 20-50% of each WXR file is taken up by this duplicated metadata.

Before posts:

        $available_sections = [
            'header',
            'site_metadata',
            'authors',
            'categories',
            'tags',
            'nav_menu_terms',
            'custom_taxonomies_terms',
            'rss2_head_action',
        ];
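
To make the overhead concrete, here is a rough back-of-the-envelope model. It is only a sketch: the function name `estimate_file_count()` and all of the sizes are made up for illustration, and it assumes every file repeats the same block of metadata and that posts can be split at arbitrary byte offsets, which a real export would not do.

```php
<?php
// Simplified model with hypothetical numbers; not taken from export-command.

/**
 * Estimate how many files an export needs when each file repeats the metadata.
 * Returns null when the metadata alone fills a file, so posts are never reached.
 */
function estimate_file_count( int $max_bytes, int $meta_bytes, int $posts_bytes ): ?int {
    $room_for_posts = $max_bytes - $meta_bytes;
    if ( $room_for_posts <= 0 ) {
        return null; // Every file would be 100% metadata.
    }
    return max( 1, (int) ceil( $posts_bytes / $room_for_posts ) );
}

$mb = 1024 * 1024;

// 15 MB limit, 6 MB of repeated metadata (40% of each file), 90 MB of posts.
echo estimate_file_count( 15 * $mb, 6 * $mb, 90 * $mb ), "\n"; // 10

// Lower bound for comparison: the same posts with no metadata written at all.
echo estimate_file_count( 15 * $mb, 0, 90 * $mb ), "\n"; // 6
```

In this made-up case, repeating the metadata raises the file count from 6 to 10, and a metadata block at or above the size limit would leave no room for posts at all.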

Describe the solution you'd like

I would like to work on code to deduplicate metadata among the generated files, to save space and processing time and to facilitate the export of extremely large sites. I've filed this as a feature request to see whether this use case and new option would be compatible with the export-command.
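
As a starting point for discussion, the selection logic could look something like the sketch below. None of this is the existing export-command code: the function `sections_for_file()`, the `$dedupe` flag, and the `$always` list are hypothetical names, and treating `header` as the only section every file needs is an assumption that would have to be checked against what importers require.

```php
<?php
// Hypothetical sketch, not the export-command implementation.
// Only the section names come from the list above.

const BEFORE_POSTS_SECTIONS = [
    'header',
    'site_metadata',
    'authors',
    'categories',
    'tags',
    'nav_menu_terms',
    'custom_taxonomies_terms',
    'rss2_head_action',
];

/**
 * Decide which "before_posts" sections to write into a given file.
 *
 * @param int      $file_index 0 for the first file of the export, 1 for the second, and so on.
 * @param bool     $dedupe     Hypothetical flag: omit repeated metadata after the first file.
 * @param string[] $always     Sections assumed to be required in every file.
 * @return string[] Section names in their original order.
 */
function sections_for_file( int $file_index, bool $dedupe, array $always = [ 'header' ] ): array {
    // Without the flag, or for the first file, behave as today and write everything.
    if ( ! $dedupe || 0 === $file_index ) {
        return BEFORE_POSTS_SECTIONS;
    }
    // Later files keep only the sections every file needs.
    return array_values( array_intersect( BEFORE_POSTS_SECTIONS, $always ) );
}

print_r( sections_for_file( 0, true ) ); // all eight sections
print_r( sections_for_file( 1, true ) ); // only 'header'
```

Keeping the full set in the first file means the export as a whole still carries all of the metadata once. Which sections can safely be dropped from later files is the open question.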

schlessera commented 3 years ago

Yes, I'm open to adding deduplication logic like that. We can add it as a flag at first and then decide later what the default state of that flag should be with regard to backward compatibility.