Closed praharshjain closed 2 months ago
This issue is a great way to kick-start your journey with our project, or to make a positive impact on open-source development. Jump in!
:sparkles: Thank you for your contribution! :sparkles:
Can you please assign me this ?
Hello @praharshjain, please assign this issue to me as i already worked on this kind of problem in past and has a great experience.
hey @praharshjain i want to work on this issue , as its my 1st work in ai so i really wanna work in this issue . thankyou!!please assign me
Can you please assign me this ?
Hi @AnkitaMalik22! Absolutely, we’re thrilled about your interest in our project! :rocket: Here’s the Contributing Guideline for Instill VDP to get you started on your journey! Please refer to the Contributing Guidelines for components as well. Don’t forget to link your pull request to the corresponding issue, and after your PR gets merged, please complete this form to claim your well-deserved points! If you ever have any questions or need a hand along the way, don’t hesitate to drop a message in this thread or hop into our Discord. Happy contributing! :blush::star2:
Is There an Existing Issue for This?
Project
Instill VDP
Is your Proposal Related to a Problem?
No, it is a new feature request.
Describe Your Proposed Solution
We can implement a "Web Crawler" operator that will take an initial URL & a depth (int) as input and recursively extract links from those pages up to the given depth, finally returning a list of strings (extracted URLs).
Highlight the Benefits
Such an operator will be useful for crawling and gathering online data. For example, the links captured by it can then be fed to the text extraction operator to build a knowledge base from linked documents.
Anything Else?
No response
INS-2214