pionxzh / chatgpt-exporter

Export and Share your ChatGPT conversation history
MIT License

Request: export JSON format #71

Closed inventor2525 closed 1 year ago

inventor2525 commented 1 year ago

Not sure if this is what you mean by JSON in the readme, but it looks like the API version you're working on can get the whole tree, and right now you're only grabbing the current branch.

I'd like a raw dump of everything that comes out of the API; that should be really easy once that's working.

I can get it from the XHR manually in the webpage, but it'd be nice to have a button.

pionxzh commented 1 year ago

Hi, may I know the use case for this? It's easy to add a button, but I'm worried that no one else will need it.

EDIT: tbh, I almost forgot about that JSON in the README 😆 We can discuss the use case and the possible format of the JSON more here.

bartybobbins commented 1 year ago

Hi, may I know the use case for this?

Archival freaks, talking from experience.

ChatGPT gives different responses based on the message content and how far down you've talked to it. Simply regenerating a message using the "Regenerate Message" button is enough to create a branch. And it can't remember whatever happened in the other conversation branches, since (from what I understand) it only references messages from the branch it's currently on.

If you talked about cows at the start of the conversation and asked whether you've mentioned cows, it'll say that you have. If you then change that starting prompt so it mentions crows instead, and ask whether you've mentioned cows, it'll say that you haven't.

pionxzh commented 1 year ago

I can only imagine people using the JSON export for programming purposes, like integrating it into part of a workflow.

What kind of structure would be the most appropriate? One option is to export the raw nodes, and users would have to connect the nodes themselves (suited to power users with analysis intentions). Alternatively, we could use the mapping method we've already implemented in the script, but this would result in the loss of unselected nodes. (You could still do an analysis on that, though.)

EDIT: ah, there is a "whole" conversation tree in the title...
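
For reference, here's a minimal sketch of the two options, assuming the conversation JSON looks roughly like the XHR responses do today: a `mapping` of nodes keyed by id with `parent`/`children` links plus a `current_node` pointer. The field names below are assumptions taken from observed responses, not a guaranteed format.

```ts
// Minimal sketch only. The Conversation/ConversationNode shapes are assumptions
// based on the observed backend responses, not a guaranteed schema.
interface ConversationNode {
  id: string
  message?: {
    author: { role: string }
    content: { parts: string[] }
    create_time?: number
  }
  parent?: string
  children: string[]
}

interface Conversation {
  title: string
  current_node: string
  mapping: Record<string, ConversationNode>
}

// Option 1: raw export. Dump the whole tree and let power users re-link nodes themselves.
function exportRawTree(conv: Conversation): string {
  return JSON.stringify(conv, null, 2)
}

// Option 2: linearized export. Walk back from `current_node`, which loses the
// unselected branches (this mirrors what the other export formats already do).
function exportCurrentBranch(conv: Conversation): ConversationNode[] {
  const branch: ConversationNode[] = []
  let id: string | undefined = conv.current_node
  while (id) {
    const node = conv.mapping[id]
    if (!node) break
    branch.unshift(node)
    id = node.parent
  }
  return branch
}
```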

inventor2525 commented 1 year ago

Seconded for everything @bartybobbins said. (Personally, I'd like an offline viewer someday, for when they're down or for other reasons.)

Initially, I'd like to use it for search (and am now doing so by manually exporting the XHR responses). ChatGPT is only set to consume about 4000-10000 tokens (I think depending on whether you're on the pro plan?), so when going through detailed technical stuff I find it very useful to describe the problem and get it into a state where it understands the current state of thought, such that it isn't giving me 'childish' or overly hand-holdy replies, and then branch from that point to work on different aspects of things until it starts forgetting things, and then branch again. (This can make it REALLY hard to find things later.)

Eventually, I'd really like the conversation to be more Reddit or git-branch style, with collapsible sections of conversation. Also, to be able to see when the latest comments were: if you go back and reply with something else on a different branch (I frequently do, because I'm scatterbrained like that, thinking of everything at once), it would be nice even just to see which prompts you were at recently. (This could be as simple as forward and back buttons that jump between prompts across the current 'conversation', or even across all of them, based on timestamp.)

I also see that, someday, it might be nice to be able to control which parts of the context the model has access to, especially for larger programming tasks, or to label parts of its replies (basically highlight whether it did well or badly in a certain sentence) so that people can train their own versions. Particularly for programming tasks, I'm finding that it likes to get hung up on previous revisions or proposals it has made rather than the current prompt (or vice versa for requirements); I'd like to be able to remove those from the context and give it a new prompt, or even re-train it for my use case. (Think of it as a more advanced data-labeling viewer.)

Even just being able to easily tell how many prompts I'm effectively giving it a day, so I can measure the feasibility of an application-specific PA and its API usage costs, is pretty valuable to me, and you need the whole tree for that.
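
(A per-day prompt count like that is easy to pull from the full tree. A tiny sketch, reusing the assumed `Conversation` shape from the snippet earlier in this thread; the `create_time` field is also an assumption:)

```ts
// Counts user prompts per day across the WHOLE tree (every user-authored node in
// `mapping`, not just the selected branch). Shapes and `create_time` are assumptions.
function promptsPerDay(conv: Conversation): Record<string, number> {
  const counts: Record<string, number> = {}
  for (const node of Object.values(conv.mapping)) {
    const msg = node.message
    if (!msg || msg.author.role !== 'user') continue
    const day = new Date((msg.create_time ?? 0) * 1000).toISOString().slice(0, 10)
    counts[day] = (counts[day] ?? 0) + 1
  }
  return counts
}
```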

That all being far out there, and obviously out of scope:

You are right, I do see this request as facilitating that sort of development more than anything. But the archival purpose has its merits too (that's the primary reason I was digging into your code before I realized the API thing you did was still getting the kinks worked out), especially if other viewers are made or their site somehow shuts down.

I don't want you to have to add too much "clutter", like, I get that... but honestly, I look at this kind of like a camera: you have your JPEG, your TIFF, etc., your quality and lighting and exposure settings... and you have RAW. Most people aren't going to shoot RAW, but there's good reason to have it for those who need it.

Honestly, it's a much larger project, but it could probably be implemented in all formats. In HTML you can implement code folding. Markdown, I think, has extensions that enable it. PDF or screenshot would get hard to read, but even for those you could do what IDEs do, where gutter lines show scope, and then possibly just label each scope 1, 2, 3, etc. Then it could simply be a single setting: do you want the whole tree, or just this branch? JSON is good for programmers, people into NLP and custom PAs, and for archival. And text, I mean... you could do anything with that; personally I wouldn't use it for a whole tree like this and wouldn't recommend wasting time on it, but you 'could', I guess (maybe draw it like the tree command in Linux).
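
As a rough illustration of the "draw it like the tree command" idea, here's a sketch that prints the whole tree with one line per node. It reuses the assumed `Conversation`/`ConversationNode` shapes from the earlier snippet, so the field names are still guesses rather than the confirmed export format.

```ts
// Renders the whole conversation tree in a `tree`-like text layout.
// Node/Conversation shapes are the same assumptions as in the earlier sketch.
function renderTree(conv: Conversation): string {
  const lines: string[] = [conv.title]
  const roots = Object.values(conv.mapping).filter(n => !n.parent)

  const walk = (id: string, prefix: string, isLast: boolean): void => {
    const node = conv.mapping[id]
    const text = node.message?.content.parts.join(' ').slice(0, 40) ?? '(empty)'
    const role = node.message?.author.role ?? 'root'
    lines.push(`${prefix}${isLast ? '└── ' : '├── '}[${role}] ${text}`)
    const childPrefix = prefix + (isLast ? '    ' : '│   ')
    node.children.forEach((childId, i) =>
      walk(childId, childPrefix, i === node.children.length - 1))
  }

  roots.forEach((root, i) => walk(root.id, '', i === roots.length - 1))
  return lines.join('\n')
}
```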

As for format... honestly, I would just use what they already have for now: just add a JSON button that grabs the whole thing in their format. Mainly for compatibility, in case other extensions add the same ability, or offline or alternative viewers are made. And if OpenAI ever changes their format and we all wish we had standardized on something?? Well... we're in the era of transformers, people!! :stuck_out_tongue_winking_eye: They learn to transform stuff really well and fast.

pionxzh commented 1 year ago

Thank you for providing such a detailed rundown, it's really helpful to understand how you are using ChatGPT in a more powerful way than I am.

Regarding the offline viewer: exporting the whole conversation list sounds like a great idea, especially in case the history feature goes down. It's always good to have a backup plan.

Your branching idea is really cool and could be useful for people who want to train and use their own customized assistant, since you wouldn't have to repeat the same prompts over and over again. However, it may not be practical in an API-based (per-token) world, because it could become too expensive to jump between branches. Nevertheless, it's an interesting idea to explore. (Although it is totally out of the scope of this tool 🤣)

I'll add a button to export the JSON file and make sure the format is 90~100% similar to the API's. By the way, I've noticed that OpenAI sometimes makes changes to their API. They recently removed some duplicated and unnecessary data from it, probably for performance reasons. It's not a huge change, but yeah, we'll have to keep up with it. 😜
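
Purely as a sketch of what that button might do (not the final implementation): fetch the conversation from the backend endpoint the page itself appears to call and save the response untouched, so the file stays as close to the API payload as possible. The endpoint path, auth header, and function names here are assumptions.

```ts
// Sketch of a "raw JSON" export handler. Assumes the script already has the
// conversation id and a bearer access token; the endpoint path is an assumption
// based on what the web app appears to call.
async function exportConversationJson(conversationId: string, accessToken: string): Promise<void> {
  const res = await fetch(`https://chat.openai.com/backend-api/conversation/${conversationId}`, {
    headers: { Authorization: `Bearer ${accessToken}` },
  })
  if (!res.ok) throw new Error(`Export failed: ${res.status}`)

  // Keep the payload byte-for-byte so the export stays "90~100% similar" to the API.
  const raw = await res.text()
  const blob = new Blob([raw], { type: 'application/json' })
  const url = URL.createObjectURL(blob)

  const a = document.createElement('a')
  a.href = url
  a.download = `conversation-${conversationId}.json`
  a.click()
  URL.revokeObjectURL(url)
}
```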

inventor2525 commented 1 year ago

:+1: Thank you! -- And awesome work btw!

Just on that side note, though: the training would, I think, happen more throughout normal use. MAYBE you'd have a few personal versions, but you can fine-tune GPT models, so performance isn't really a concern; they offer it already, just not through their ChatGPT version (not sure if the Playground does). (Mind you, one day soon they likely will, but given what we've seen with extensions that offer it for, like, everything... perhaps that remains a community-driven thing, with offline or even alternate online viewers, permanently.)

The need to see the branches is really just for training utility in that. For instance: 1) prompt, 2) generate 3 responses, 3) highlight the good and bad parts in them, 4) (optional) alter the prompt, 5) select the better answer (to prompt v2) to be paired with prompt 1 for fine-tuning, so it learns more of your "style" and understands you better up front. (Mine, for instance: trying to get it to pay much more attention to the context of the larger conversation, which already has lots of requirements listed, while I write very weak, word-vomit prompts with a low attention span and expect it to use the previous conversation to make sense of them.)

Even so, if you leave some sort of requirements in the conversation that are meant to customize it, and branch from there as part of your workflow, it doesn't really cost anything performance-wise because, as was mentioned, it only looks at the current branch, so it makes no difference to it. Every prompt, it gets the last several thousand tokens of the current thread no matter what you do. You can see how many tokens it cuts you off at by using this, asking it to summarize your conversation, and seeing where it loses you. Performance is basically fixed. The web implementation is the only limiting factor, but we're talking about JSON; realistically you're not going to generate THAT much compared to, say, video. As long as the implementation isn't wholly "dumb", it's fine.
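
To make the "pair the better answer with its prompt" step concrete, here's a small sketch that serializes hand-picked pairs into the prompt/completion JSONL that OpenAI's (legacy) fine-tuning endpoint accepted. The selection itself stays manual; the type and function names are purely illustrative.

```ts
// Serializes manually selected prompt/answer pairs into fine-tuning JSONL.
// `LabeledPair` is a hypothetical shape for the "highlight and pick the better
// answer" step described above.
interface LabeledPair {
  prompt: string
  chosenCompletion: string
}

function toFineTuneJsonl(pairs: LabeledPair[]): string {
  return pairs
    // Leading space on the completion follows the old fine-tuning guide's
    // whitespace recommendation.
    .map(p => JSON.stringify({ prompt: p.prompt, completion: ` ${p.chosenCompletion}` }))
    .join('\n')
}
```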

millennialliterarian commented 1 year ago

This is joking-not-joking-rhetorical... (For context: I'm in an Ed Tech & Media Leadership, aka library school, Master's program, taking a course on qualitative research while submitting my thesis application late, alongside too many other things, so I want to capitalize on every effort.) Mid-chat, where I'm prompting ChatGPT to help me develop the structure of the introduction for my thesis (which has to do with tech and COVID and is biographical; not relevant here), I realize the simultaneous qualitative study I am doing on libraries and climate action might be too much, so I prompt ChatGPT...

Me: Write the introduction for the qualitative study I wish I were doing on how ChatGPT is tremendously helpful for someone with ADHD who is presently constructing their qualitative study.

Result: Introduction:

Attention Deficit Hyperactivity Disorder (ADHD) is a common neurodevelopmental disorder that affects approximately 6% to 9% of children and 2% to 5% of adults worldwide. People with ADHD often experience difficulties with organization, attention, and time management, which can make academic or research-related tasks particularly challenging. Qualitative research, in particular, requires a high level of attention to detail and organization, making it even more challenging for individuals with ADHD.

This qualitative study explores how ChatGPT, a large language model trained by OpenAI, can be a useful tool for individuals with ADHD who are constructing their qualitative study. ChatGPT is an advanced artificial intelligence system that can generate text-based responses to a wide range of prompts and questions. By using ChatGPT, individuals with ADHD can receive immediate and personalized feedback on their writing, which can help them stay organized, focused, and on track with their research.

The purpose of this study is to explore the experiences of individuals with ADHD who use ChatGPT to support their qualitative research. Specifically, the study aims to identify the ways in which ChatGPT can help individuals with ADHD overcome common challenges associated with qualitative research, such as organizing and structuring their research questions, generating ideas, and staying focused throughout the writing process.

The study will employ a qualitative approach, using semi-structured interviews to collect data from individuals with ADHD who have used ChatGPT to support their qualitative research. The interviews will explore participants' experiences with using ChatGPT, including the challenges they faced before using ChatGPT, the benefits they experienced as a result of using ChatGPT, and any limitations or drawbacks associated with this approach. The data will be analyzed using a thematic analysis approach, which will allow for the identification of key themes and patterns in the data.

Overall, this study has the potential to contribute to a deeper understanding of how technology, such as ChatGPT, can be used to support individuals with ADHD in their qualitative research endeavors. By exploring the benefits and limitations of this approach, this study can provide insights into how technology can be harnessed to support individuals with ADHD in their academic and research pursuits, ultimately helping them to achieve their goals and reach their full potential.


...so now I have the export feature on ChatGPT, because I took some code (I don't know how I found it, here on GitHub somewhere, because I have ADHD), and created a login the other night for I-can't-remember-the-reason... and I gave the code to ChatGPT, told it I'm not a coder, how do I use this, got step-by-step instructions, followed them, then wondered how to export the whole chat history log in continuity and ended up here. So, joking not joking: beyond the whole tree, I would request the whole tree across multiple sites. Joking, because copy + pasting all this into a Word doc was much faster than writing this message, let alone, I imagine, writing code. But I know nothing. Imagine. If Beachboard qualitative assignments were a tree, GPT a tree, GitHub a tree, and Reddit's GPT page a tree... I would never have to know anything and I'd have a side ache (from all the laughing). I missed the forest for the theoretical trees? Note to self: while ChatGPT is helpful for my ADHD in terms of writing development, format progress, brainstorming, soundboarding, etc., GitHub is definitively the opposite.

Did yall already enter the contest tho? I feel like you'd win. https://cdn.openai.com/chatgpt/chatgpt-feedback-contest.pdf