Closed tribixbite closed 3 months ago
Thank you for contributing!
I'm afraid I will need to roll back this merged pull request. It no longer copies the output to the clipboard, and it adds extra prompts for the user. I would prefer that only one URL or path be passed, with the program intuitively handling it properly.
I'll make a new one that keeps your preferences by default and lets the user set config flags to turn on/off features. I opened this before making the third commit
For doing a lot of extraction, I added custom naming for output files that includes the token count, so you can use the files later without having to remember how big they are. GPT-aided summary:
Subdirectory Output Organization: Modified the script to create output files (`_full_output.txt`, `_min_output.txt`, and `_processed_urls.txt`) within a dynamically named subdirectory under `output/`, based on the input source name. This change aids in maintaining a cleaner working directory and better organizes outputs, especially when processing multiple sources.
Dynamic Filename Convention: Updated the filename convention to `{base_name}_{token_count}_{type}.txt` for both uncompressed (`full`) and compressed (`min`) output files, where `{type}` reflects the file's content. This update makes it easier to identify files by their source and content status, providing quick insights into the token count directly from the filename.
README Updates: Revised the README file to reflect these changes, ensuring users are fully informed about the tool's functionality, usage, and the new file naming and organization scheme. The documentation now includes updated instructions and clarifies the output file structure, enhancing the tool's accessibility to new users.
Reason for Changes:
Improved File Management: By organizing output files into subdirectories, users can more easily manage their workspace, especially when dealing with multiple data sources. This organization prevents clutter and makes it straightforward to locate and distinguish outputs from different inputs.
Enhanced File Naming: Incorporating the token count and content type into filenames provides immediate context about each file's contents and processing state without needing to open the file. This naming convention is particularly useful for users working with large datasets and needing quick file identification.
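For illustration, here is a minimal sketch of how the naming scheme described above could be put together. It is not the code from this PR: the helper names `derive_base_name`, `count_tokens`, and `build_output_path` are hypothetical, the sanitizing rule is an assumption, and the whitespace-based token count is only a stand-in for whatever tokenizer the script actually uses.

```python
from pathlib import Path
from urllib.parse import urlparse


def derive_base_name(source: str) -> str:
    """Hypothetical helper: turn a URL or local path into a filesystem-safe name."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        raw = parsed.netloc + parsed.path
    else:
        # Assumption: for local inputs, use only the final path component.
        raw = Path(source).name or source
    # Replace anything that is not alphanumeric, '-' or '_' with '_'.
    cleaned = "".join(c if c.isalnum() or c in "-_" else "_" for c in raw)
    return cleaned.strip("_") or "output"


def count_tokens(text: str) -> int:
    """Placeholder token counter (whitespace split), not a real tokenizer."""
    return len(text.split())


def build_output_path(source: str, text: str, kind: str, root: str = "output") -> Path:
    """Build output/{base_name}/{base_name}_{token_count}_{type}.txt."""
    base = derive_base_name(source)
    return Path(root) / base / f"{base}_{count_tokens(text)}_{kind}.txt"


if __name__ == "__main__":
    # Only computes and prints the path; nothing is written to disk.
    path = build_output_path("https://github.com/user/repo", "some extracted text", "full")
    print(path.as_posix())
```

The sketch only computes the path. A caller would still need to create the subdirectory (for example with `path.parent.mkdir(parents=True, exist_ok=True)`) before writing the file.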