climate-tech-handbook / data-magic

Scripts and other fun tricks to enhance the site
MIT License

Automate content summary bullets? #40

Closed: futuresoup closed this issue 1 year ago

futuresoup commented 1 year ago

The first round of GPT content is too wordy. Many redundant phrases are used throughout the article.

Simple bullets are much better.

This can be fixed by running this GPT prompt: "rewrite the following article using markdown syntax to be more succinct and more like a bullet summary. Professional for readers scrolling on their phones without much time."
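As a rough illustration of how that prompt could be paired with an article before it is sent to a model, here is a minimal sketch. The names `SUMMARY_INSTRUCTION` and `build_summary_prompt` are hypothetical and not part of the repo; the instruction text is the prompt quoted above.

```python
# Instruction text from the issue; the constant and function names are hypothetical.
SUMMARY_INSTRUCTION = (
    "rewrite the following article using markdown syntax to be more succinct "
    "and more like a bullet summary. Professional for readers scrolling on "
    "their phones without much time."
)


def build_summary_prompt(article_markdown: str) -> str:
    """Combine the fixed instruction with the article text, separated by a blank line."""
    return f"{SUMMARY_INSTRUCTION}\n\n{article_markdown.strip()}"
```

Keeping the instruction in one constant means the wording can be tuned in a single place if the summaries come out too long or too terse.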


Here's an example result using the prompt.

The lengthy article on Biochar production turns into this after the prompt:


Overview

Significant progress has been made in developing Biochar Production to reverse climate change. Technologies such as efficient biochar production methods, carbon capture and storage, and emission-reducing agricultural practices have played a key role. Organizations like the Biochar Industry Consortium, International Biochar Initiative, Carbon Sequestration Leadership Forum, and Climate Change Mitigation Initiative have been at the forefront of this solution.

Progress Made

ArLough commented 1 year ago

@CaelanMacArthur said they were interested in taking this on.

One way we might do this is to create a new endpoint that takes a file path and then, depending on some logic, summarizes and replaces the markdown. We do have an example of past work with the OpenAI API that might be useful: generate_markdown.py

Of course, if you want to take it a different way, Caelan, I'm open to that too. These are just some of my initial thoughts.
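The "take a file path, summarize, replace the markdown" idea could look roughly like the sketch below. It is only an illustration under assumptions: `summarize_markdown_file` is a hypothetical name, and `summarize` stands in for whatever function ends up calling the model (it is not taken from generate_markdown.py).

```python
from pathlib import Path
from typing import Callable


def summarize_markdown_file(path: str, summarize: Callable[[str], str]) -> bool:
    """Replace a markdown file's content with its summary.

    `summarize` is a hypothetical callable (for example a wrapper around a
    GPT call) that maps article text to a shorter version.
    Returns True if the file was rewritten.
    """
    file_path = Path(path)
    if file_path.suffix.lower() != ".md":
        return False  # only touch markdown files

    original = file_path.read_text(encoding="utf-8")
    if not original.strip():
        return False  # nothing to summarize

    summary = summarize(original)
    if not summary.strip() or summary == original:
        return False  # keep the original if nothing new came back

    file_path.write_text(summary, encoding="utf-8")
    return True
```

Passing the summarizer in as an argument keeps the file handling testable without network access, and leaves the choice of model call open.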

ghost commented 1 year ago

@ArLough

Appreciate the resource! Creating a new endpoint with the functionality/logic you suggest sounds good! I'm almost done with the logic to take in a file path. Do we want the functionality to update multiple markdown files at a time or just one at a time?

ArLough commented 1 year ago

I think updating multiple markdown files at once would be good. We have handled similar behavior in the past by working per directory, so in this case you would want a directory path instead of a file path.

Just looping through a specified directory should do the trick as far as updating them.

ghost commented 1 year ago

os.walk might work in this case then. Okay, are we making this asynchronous to digest files as they come in, or is this only called if someone is trying to update multiple files?

ArLough commented 1 year ago

I think right now we want it to be called just to update multiple files that are already in the specified directory, not asynchronously.
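A synchronous pass over a directory with os.walk, as discussed in this thread, might be sketched like this. The function name `summarize_directory` and the `summarize` callable are hypothetical placeholders, and the sketch assumes every file ending in .md under the directory should be rewritten in place.

```python
import os
from typing import Callable, List


def summarize_directory(root: str, summarize: Callable[[str], str]) -> List[str]:
    """Synchronously rewrite every .md file under `root`; return the updated paths.

    `summarize` is a hypothetical callable that turns article text into a
    shorter bullet summary (for example by calling a GPT model).
    """
    updated = []
    # os.walk visits `root` and all of its subdirectories.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith(".md"):
                continue  # skip non-markdown files
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as f:
                original = f.read()
            summary = summarize(original)
            with open(path, "w", encoding="utf-8") as f:
                f.write(summary)
            updated.append(path)
    return updated
```

Because the files are processed one after another, a failed model call stops the run at a known file; the returned list shows which files were already rewritten.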

ghost commented 1 year ago

Gotcha, okay, sounds good!