ijyliu / anlp23-project

An empirical study of the costs and practicalities of prompt engineering techniques on standard and novel benchmarks

Finding the Most Common Prompt Engineering Techniques #36

Closed ijyliu closed 10 months ago

ijyliu commented 10 months ago

Create a long list of prompt engineering techniques and evaluate their online popularity. Be very careful with minor string differences - "chain-of-thought" versus "chain of thought" versus "CoT", etc.
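One way to guard against those string differences is to normalize every name before counting. The snippet below is a minimal sketch, not the project's actual code: the `ALIASES` table and the function names are hypothetical, and the alias list would need to be filled in by hand for each technique.

```python
import re
from collections import Counter

# Hypothetical alias table: abbreviation -> canonical name (extend by hand).
ALIASES = {
    "cot": "chain of thought",
}


def normalize_technique(name: str) -> str:
    """Lowercase, turn hyphens/underscores/whitespace runs into one space, expand aliases."""
    key = re.sub(r"[\s_\-\u2010-\u2015]+", " ", name.strip().lower())
    return ALIASES.get(key, key)


def tally(names):
    """Count mentions per canonical technique name."""
    return Counter(normalize_technique(n) for n in names)
```

With this, "chain-of-thought", "chain of thought", and "CoT" all land in the same bucket, so counts from different sources can be summed per technique.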

Reddit

Just use the search function to count the number of posts, or scrape the posts.

Here are the largest communities:

https://www.reddit.com/r/PromptEngineering/

https://www.reddit.com/r/ChatGPT/

https://www.reddit.com/r/ChatGPTPromptGenius/

https://www.reddit.com/r/PromptDesign/

Google Trends

https://trends.google.com/trends/explore?geo=US&hl=en-US

Google Scholar

Just check the citation count of the original paper. Potentially use citations per month since release to adjust for recency.

ijyliu commented 10 months ago

Final plan, no more screwing around.

Do this all on one day:

  1. Save Internet Archive snapshots of the guide page and the Wikipedia page
  2. Scrape approaches from the guide
  3. Create a list based on the papers cited on Wikipedia (could scrape in future)
  4. Combine the lists from the guide approaches, the Wikipedia papers, and the Zotero citations
  5. Get the Semantic Scholar citation count and publication date for each paper
  6. For papers that are not found, manually look up the paperId, then get citation counts and publication dates by paperId
  7. Calculate citations per day since release and output a sorted list. Keep papers with missing data in the output
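
Step 7 could look roughly like the sketch below. It assumes each paper is a dict with `citationCount` and `publicationDate` (an ISO `YYYY-MM-DD` string or `None`), which mirrors field names in the Semantic Scholar Graph API; the function names and the choice to sort missing rows last are assumptions, not the project's actual implementation.

```python
from datetime import date


def citations_per_day(citation_count, publication_date, end_date):
    """Citations per day since release, or None if data is missing or the span is not positive."""
    if citation_count is None or publication_date is None:
        return None
    days = (end_date - date.fromisoformat(publication_date)).days
    if days <= 0:
        return None
    return citation_count / days


def rank_papers(papers, end_date):
    """Return copies of the paper dicts with a citations_per_day field, highest first.

    Papers with missing data are kept and placed at the end.
    """
    rows = [
        dict(p, citations_per_day=citations_per_day(
            p.get("citationCount"), p.get("publicationDate"), end_date))
        for p in papers
    ]
    rows.sort(key=lambda r: (r["citations_per_day"] is None,
                             -(r["citations_per_day"] or 0.0)))
    return rows
```

Calling `rank_papers(papers, date.today())` on the scrape day gives the sorted list; rows where the lookup failed keep `citations_per_day` as `None` so they stay visible for manual follow-up.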

ijyliu commented 10 months ago

Inspiring note: the actual scrape date for Semantic Scholar doesn't matter much, as long as we set the end_date to "today" (the day of the scrape). The citations-per-day figure will adjust accordingly.

ijyliu commented 10 months ago

Task completed. Will start a new issue for choosing methods.