davidmigloz / langchain_dart

Build LLM-powered Dart/Flutter applications.
https://langchaindart.dev
MIT License
414 stars, 73 forks

Fetching GitHub README content from a URL is blocked by GitHub when using langchain_dart #181

Closed: lucasjinreal closed this issue 11 months ago

lucasjinreal commented 11 months ago

System Info

flutter latest

Related Components

Reproduction

I am using the langchain_dart WebLoader to load text from a URL.

For a GitHub URL, the first attempt works, but every attempt after that returns this:

 OpenAIChatCompletionChoiceMessageModel(role: OpenAIChatMessageRole.user, content: Skip to contentToggle navigationSign upProductActionsAutomate any workflowPackagesHost and manage packagesSecurityFind and fix vulnerabilitiesCodespacesInstant dev environmentsCopilotWrite better code with AICode reviewManage code changesIssuesPlan and track workDiscussionsCollaborate outside of codeExploreAll featuresDocumentationGitHub SkillsBlogSolutionsForEnterpriseTeamsStartupsEducationBy SolutionCI/CD & AutomationDevOpsDevSecOpsResourcesLearning PathwaysWhite papers, Ebooks, WebinarsCustomer StoriesPartnersOpen SourceGitHub SponsorsFund open source developersThe ReadME ProjectGitHub community articlesRepositoriesTopicsTrendingCollectionsPricingSearch or jump to...Search code, repositories, users, issues, pull requests...SearchClearSearch syntax tipsProvide feedbackWe read every piece of feedback, and take your input very seriously.Include my email address so I can be contactedCancelSubmit feedbackSaved searchesUse saved searches to filter your results more quicklyNameQueryTo see all available qualifiers, see ourdocumentation.CancelCreate saved searchSign inSign in to GitHubUsername or email addressPasswordForgot password?or sign in with a passkeySign upYou signed in with another tab or window.Reloadto refresh your session.You signed out in another tab or window.Reloadto refresh your session.You switched accounts on another tab or window.Reloadto refresh your session.Dismiss alert{{ message }}Find code, projects, and people on GitHub:SearchContact Support—GitHub Status—@githubstatusSubscribe toThe GitHub InsiderDiscover tips, technical guides, and best practices in our monthly newsletter for developers.SubscribeProductFeaturesEnterpriseCopilotSecurityPricingTeamResourcesRoadmapCompare GitHubPlatformDeveloper APIPartnersElectronGitHub DesktopSupportDocsCommunity ForumProfessional ServicesPremium SupportSkillsStatusContact GitHubCompanyAboutCustomer storiesBlogThe ReadME 
ProjectCareersPressInclusionSocial ImpactShopGitHub on XGitHub on FacebookGitHub on LinkedInGitHub on YouTubeGitHub on TwitchGitHub on TikTokGitHub’s organization on GitHub© 2023 GitHub, Inc.TermsPrivacy(Updated 08/2022)SitemapWhat is Git?You can’t perform that action at this time.

Expected behavior

This appears to be GitHub's blank placeholder page rather than the README. Could you add some more sophisticated request handling so that GitHub won't treat the loader as a scraper?

davidmigloz commented 11 months ago

hey @lucasjinreal,

Thanks for reporting the issue.

I don't think there is anything I can do about it. If you want to load files from GitHub, you can create your own custom document loader that uses the GitHub API instead. That way you avoid the scraping protection, and it should also improve performance and accuracy.
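
For anyone landing here later, a minimal sketch of the fetching part of such a loader is below. It is not code from langchain_dart. It assumes GitHub's REST endpoint `GET /repos/{owner}/{repo}/readme`, which returns JSON with a base64-encoded `content` field, and it uses only `dart:io` and `dart:convert`. The function names (`readmeApiUri`, `decodeReadme`, `fetchReadme`) and the `readme-loader-sketch` user agent are made up for illustration.

```dart
import 'dart:convert';
import 'dart:io';

/// Builds the GitHub REST API URL for a repository's README.
/// (Hypothetical helper, not part of langchain_dart.)
Uri readmeApiUri(String owner, String repo) =>
    Uri.https('api.github.com', '/repos/$owner/$repo/readme');

/// Decodes the base64 `content` field of the API's JSON response.
/// The payload may be wrapped with newlines, which Dart's base64 decoder
/// rejects, so all whitespace is stripped before decoding.
String decodeReadme(Map<String, dynamic> json) {
  final encoded = (json['content'] as String).replaceAll(RegExp(r'\s'), '');
  return utf8.decode(base64.decode(encoded));
}

/// Fetches a repository README as plain text through the API.
/// [token] is an optional personal access token; unauthenticated
/// requests work for public repositories but have a lower rate limit.
Future<String> fetchReadme(String owner, String repo, {String? token}) async {
  final client = HttpClient();
  try {
    final request = await client.getUrl(readmeApiUri(owner, repo));
    request.headers
        .set(HttpHeaders.acceptHeader, 'application/vnd.github+json');
    request.headers.set(HttpHeaders.userAgentHeader, 'readme-loader-sketch');
    if (token != null) {
      request.headers.set(HttpHeaders.authorizationHeader, 'Bearer $token');
    }
    final response = await request.close();
    final body = await response.transform(utf8.decoder).join();
    if (response.statusCode != HttpStatus.ok) {
      throw HttpException('GitHub API returned ${response.statusCode}: $body');
    }
    return decodeReadme(jsonDecode(body) as Map<String, dynamic>);
  } finally {
    client.close();
  }
}
```

Calling `await fetchReadme('davidmigloz', 'langchain_dart')` should return the README as Markdown text instead of the scraped navigation chrome shown above. Inside a custom document loader, that string would become the page content of the document the loader returns. The exact base class and constructor depend on the langchain_dart version you have installed.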