BuilderIO / gpt-crawler

Crawl a site to generate knowledge files to create your own custom GPT from a URL
https://www.builder.io/blog/custom-gpt
ISC License

A fork modified to specialize in crawling source code from GitHub projects #86

Closed: FTAndy closed this issue 10 months ago

FTAndy commented 10 months ago

I forked this project and published an npm package called repo-crawler-for-gpt that crawls the source code of a GitHub repo. The package crawls only the source code and README files of a GitHub repo.

You can generate a GPT from the output.json file. Here is the prompt I used to generate a GPT that analyzes the original gpt-crawler repo:

specialized GPT for comprehensive analysis of GitHub projects through JSON files, interpreting 'html' as code and 'url' as file titles. It's proficient in various programming languages, adept at identifying errors, suggesting performance upgrades, and offering a complete code review. It aims to elevate code quality and promote best practices among developers of all skill levels.

Constraints: Code Analyzer focuses on static analysis, avoiding execution or testing of code. It maintains an objective, informative tone, and refrains from rewriting large code sections, suggesting only minor improvements.

Guidelines: With a detail-oriented approach, Code Analyzer uses its extensive knowledge in software development and programming languages to provide insightful analysis. It encourages best coding practices and guides users towards code optimization.

Clarification: Code Analyzer will either ask for more details or make informed assumptions when additional information is needed, ensuring the most accurate analysis possible.

Personalization: Code Analyzer combines a professional demeanor with a touch of casualness, making its interactions more engaging. Its tone is encouraging and straightforward, aiming to provide clear and helpful insights into the code.

The resulting GPT is a very helpful bot for analyzing a codebase when you encounter a new repo. Here is how to use the package in your Node.js environment:

const { crawlerGithubForGPT } = require("repo-crawler-for-gpt");

crawlerGithubForGPT({
  githubRepoUrl: 'https://github.com/BuilderIO/gpt-crawler',
  // Specify either a branch or a tag
  branch: 'main',
  // tag: 'v1.0.0'
});
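Before uploading output.json to a GPT, it can help to check what the crawl actually captured. The sketch below is not part of repo-crawler-for-gpt. It assumes output.json is a JSON array of records with a `url` field (the file title) and an `html` field (the file's code), as the prompt above implies. The helper names `loadCrawlOutput` and `summarizeCrawlOutput` are hypothetical.

```javascript
const fs = require("fs");

// Hypothetical helper: read and parse the crawler's output file.
// Assumption: the file holds a JSON array of { url, html } records.
function loadCrawlOutput(path) {
  return JSON.parse(fs.readFileSync(path, "utf8"));
}

// Hypothetical helper: list each crawled file with its line count.
function summarizeCrawlOutput(records) {
  if (!Array.isArray(records)) {
    throw new TypeError("expected an array of crawl records");
  }
  return records.map((record) => ({
    file: record.url, // 'url' is treated as the file title
    lines: String(record.html).split("\n").length, // 'html' holds the code
  }));
}

// Usage, once the crawl has written output.json:
// console.table(summarizeCrawlOutput(loadCrawlOutput("output.json")));
```

If the real output uses different field names, only the two property accesses in `summarizeCrawlOutput` need to change.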

I think the idea behind this repo is brilliant, but developers need to customize it for their own requirements, and it lacks many of the lower-level configuration options and functions needed to do that.

Hope this new npm package can help you.