Currently, export_watcher.py extracts files to a flat directory structure.
Use the original file/path structure instead. The tricky part is updating the links: a relative link depends on how deep the linking file sits in the hierarchy, so we have to keep track of depth. This is a very low-priority task.
We probably want to create a new program for this rather than combining all of these diverse features into one.
Hierarchical Export Specification
Overview
This utility enhances the current export_watcher.py functionality by preserving Notion's hierarchical file/directory structure while cleaning filenames and maintaining proper internal markdown link relationships.
Core File Processing Rules
Filename Cleaning
All files must follow these standardized cleaning rules from export_watcher.py:
Remove date patterns:
# Remove patterns like "10 24 2024 - " from start of filename
filename = re.sub(r'\d{2}[\s_]+\d{2}[\s_]+\d{4}[\s_]*-[\s_]*', '', filename)
Remove GUID patterns:
# Remove 32-character hex strings from end of filename
filename = re.sub(r'\s+[a-f0-9]{32}$', '', filename)
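The two rules above can be combined into one helper. The following is a minimal sketch, not the actual export_watcher.py code: the name `clean_filename` is hypothetical, and applying the rules to the stem (so the extension does not block the `$` anchor) and replacing spaces with underscores are assumptions, the latter borrowed from `clean_directory_name` in this spec.

```python
import re
from pathlib import Path

# Patterns copied from the rules above.
DATE_PREFIX = re.compile(r'\d{2}[\s_]+\d{2}[\s_]+\d{4}[\s_]*-[\s_]*')
GUID_SUFFIX = re.compile(r'\s+[a-f0-9]{32}$')


def clean_filename(filename: str) -> str:
    """Apply the date and GUID rules to the stem, keeping the extension."""
    path = Path(filename)
    stem = DATE_PREFIX.sub('', path.stem)
    stem = GUID_SUFFIX.sub('', stem)
    # Assumption: spaces become underscores, as for directory names.
    return stem.strip().replace(' ', '_') + path.suffix
```

For example, `clean_filename("10 24 2024 - Doc Two 0123456789abcdef0123456789abcdef.md")` returns `"Doc_Two.md"`.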
Core Components
1. DirectoryProcessor
import re

class DirectoryProcessor:
    def __init__(self, input_dir, output_dir):
        self.input_dir = input_dir
        self.output_dir = output_dir
        self.filename_mapping = {}   # Maps original paths to new paths
        self.directory_mapping = {}  # Maps original dirs to new dirs

    def clean_directory_name(self, dirname):
        # Remove date pattern
        dirname = re.sub(r'\d{2}[\s_]+\d{2}[\s_]+\d{4}[\s_]*-[\s_]*', '', dirname)
        # Remove GUID pattern
        dirname = re.sub(r'\s+[a-f0-9]{32}$', '', dirname)
        # Standardize format
        return dirname.strip().replace(' ', '_')

    def process_directory_structure(self):
        """Creates cleaned directory structure and builds mapping"""
2. PathResolver
class PathResolver:
    def __init__(self, filename_mapping, directory_mapping):
        self.filename_mapping = filename_mapping
        self.directory_mapping = directory_mapping

    def get_relative_path(self, source_file, target_file):
        """Calculate relative path between two files in hierarchy"""

    def update_markdown_links(self, content, current_file_path):
        """Update all markdown links in content based on new paths"""
3. FileProcessor
class FileProcessor:
    def __init__(self, directory_processor, path_resolver):
        self.directory_processor = directory_processor
        self.path_resolver = path_resolver

    def process_file(self, input_path, output_path):
        """Process single file - clean name and update links"""

    def process_all_files(self):
        """Process all files while maintaining hierarchy"""
Processing Flow
1. Directory Structure Creation
   - Scan input directory recursively
   - Clean directory names using rules above
   - Create output directory structure
   - Build directory mapping
2. Initial File Processing
   - Clean filenames using export_watcher.py rules
   - Copy files to new locations
   - Build comprehensive path mapping
   - Preserve file metadata
3. Link Resolution
   - Parse markdown files for links
   - Calculate new relative paths
   - Update links using new paths
   - Validate updated links
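The final validation step could look roughly like the sketch below. The function name `find_broken_links` and the link regex are assumptions; the regex covers only inline `[text](target)` links without nested parentheses, and percent-decoding is included in case exported links are URL-encoded.

```python
import re
from pathlib import Path
from urllib.parse import unquote

# Matches inline links such as [text](target); a simplification of
# full markdown syntax (no reference links, no nested parentheses).
LINK_RE = re.compile(r'\[([^\]]*)\]\(([^)]+)\)')


def find_broken_links(markdown_path):
    """Return local link targets in markdown_path that do not exist."""
    md_file = Path(markdown_path)
    broken = []
    for _text, target in LINK_RE.findall(md_file.read_text(encoding="utf-8")):
        # Skip external URLs, mail links and in-page anchors.
        if re.match(r'[a-zA-Z][a-zA-Z0-9+.-]*:', target) or target.startswith('#'):
            continue
        # Drop any "#section" fragment and decode %20-style escapes.
        local = unquote(target.split('#', 1)[0])
        if not (md_file.parent / local).exists():
            broken.append(target)
    return broken
```

Running this over every markdown file in the output tree after link resolution gives a list of targets to log or report, without modifying any file.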
Link Processing Examples
Same Directory:
Original: [Link](10 24 2024 - Doc Two abc123.md)
Updated: [Link](Doc_Two.md)
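The same rewrite extends to links that point into a child or parent directory. The sketch below is one possible take on `update_markdown_links`, written as a free function. The `path_mapping` argument (original path to cleaned path, both relative to the export root with forward slashes) and the shortened `abc123` IDs in the usage example are illustrative assumptions.

```python
import posixpath
import re

LINK_RE = re.compile(r'\[([^\]]*)\]\(([^)]+)\)')


def update_markdown_links(content, current_file, path_mapping):
    """Rewrite links in content, which belongs to current_file.

    current_file is the file's original path; path_mapping maps original
    paths to cleaned paths, relative to the export root.
    """
    old_dir = posixpath.dirname(current_file)
    new_dir = posixpath.dirname(path_mapping[current_file])

    def rewrite(match):
        text, target = match.group(1), match.group(2)
        # Resolve the link against the original location of the file.
        original = posixpath.normpath(posixpath.join(old_dir, target))
        if original not in path_mapping:
            return match.group(0)  # external or unknown link: leave as-is
        new_target = posixpath.relpath(path_mapping[original], start=new_dir or ".")
        return f"[{text}]({new_target})"

    return LINK_RE.sub(rewrite, content)


mapping = {
    "Home abc123.md": "Home.md",
    "Home abc123/Doc One abc123.md": "Home/Doc_One.md",
    "Home abc123/10 24 2024 - Doc Two abc123.md": "Home/Doc_Two.md",
}
# Child directory: a top-level page linking down into its folder.
print(update_markdown_links("[Link](Home abc123/Doc One abc123.md)",
                            "Home abc123.md", mapping))
# Parent directory: a nested page linking back up.
print(update_markdown_links("[Up](../Home abc123.md)",
                            "Home abc123/Doc One abc123.md", mapping))
```

The two calls print `[Link](Home/Doc_One.md)` and `[Up](../Home.md)`. Resolving each link against the file's original location first, then recomputing it from the file's new location, keeps links correct even when both ends have been renamed.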
This specification maintains compatibility with the existing export_watcher.py functionality while adding hierarchical structure preservation and enhanced link management.