Open aaronsteers opened 5 months ago
@aaronsteers I am interested in working on this and also willing to work on #31 which is closely related to this! Please assign it to me!
Awesome! You are the first to chime in so I think this one is yours! Can you also drop a comment in the other issue? (GitHub won't let me assign otherwise.)
@aaronsteers I've started working on this issue and have begun building a connector for Hugging Face datasets with the Python CDK. I just wanted to confirm that this issue and #31 count as feature contributions, because I was recently not assigned #20 in quickstarts (probably due to confusion, as #30 and #31 are currently in the No Hackathon category). I had been waiting for the past 5 days to get it assigned, even before I got this one assigned!
Hi, @ombhardwajj. I apologize for any confusion. I've put this and #31 into the Feature Contributions category.
Do you need any assistance on this item or on #31?
@aaronsteers Thanks for the concern. Regarding #31, I am going to solve this issue first and then start on #31. Currently I am facing some dependency conflicts, so I was thinking of switching to low-code instead of the Python CDK. Does that work for you? Otherwise I'll give it another try...
Over the past week, I tried to build this but, unfortunately, I have been facing some errors. Despite my efforts to resolve them, I have not been successful. Therefore, I am un-assigning myself from this issue.
Hi @aaronsteers, can I work on this issue?
@ombhardwajj - I understand. Thanks for looping back.
@bala-ceg - If you still want to pick this up, it is yours. 👍
@marcosmarxm @aaronsteers can you please let me know which connector development method I should follow: the Python CDK or the low-code CDK?
Low-code if possible, but if it isn't, you need to use the Python CDK.
Overview
This blog post came out 2 weeks ago, announcing a new feature where DuckDB can now extract from Hugging Face datasets using the hf:// URI prefix. We think this would make an awesome connector for users in our community.
https://duckdb.org/2024/05/29/access-150k-plus-datasets-from-hugging-face-with-duckdb.html
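For anyone picking this up, here is a minimal sketch of what reading through the hf:// prefix might look like from Python. It assumes the `duckdb` package is installed in a version that supports hf:// paths and that the machine has network access. The helper names (`hf_uri`, `build_query`, `read_rows`) are hypothetical and not part of DuckDB or Airbyte; the path layout `hf://datasets/<user>/<dataset>/<file>` follows the examples in the linked blog post.

```python
from typing import Any, Dict, Iterator, Optional


def hf_uri(repo_id: str, file_path: str) -> str:
    """Build an hf:// URI for one file inside a Hugging Face dataset repo."""
    return f"hf://datasets/{repo_id.strip('/')}/{file_path.lstrip('/')}"


def build_query(uri: str, limit: Optional[int] = None) -> str:
    """Build a SELECT over the file; single quotes in the URI are escaped."""
    escaped = uri.replace("'", "''")
    sql = f"SELECT * FROM '{escaped}'"
    if limit is not None:
        sql += f" LIMIT {int(limit)}"
    return sql


def read_rows(uri: str, limit: Optional[int] = None) -> Iterator[Dict[str, Any]]:
    """Yield each row as a dict. Needs duckdb and network access (assumption)."""
    import duckdb  # imported lazily so the helpers above work without it

    con = duckdb.connect()
    cursor = con.execute(build_query(uri, limit))
    columns = [col[0] for col in cursor.description]
    for row in cursor.fetchall():
        yield dict(zip(columns, row))


if __name__ == "__main__":
    # Example dataset path taken from the DuckDB blog post.
    uri = hf_uri("datasets-examples/doc-formats-csv-1", "data.csv")
    for record in read_rows(uri, limit=3):
        print(record)
```

Keeping URI and query construction separate from the actual read makes the string handling easy to unit test without hitting the network.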
Technical spec
You would write a new source connector which can connect to Hugging Face source datasets and emit records from them, allowing Airbyte users to send these to any Airbyte destination.
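As a rough illustration of the "emit records" part, the sketch below wraps plain row dicts in Airbyte-protocol-style RECORD messages and serializes them as JSON lines. It deliberately does not use the Python CDK or low-code CDK, which a real connector would build on; the function names `to_record_message` and `emit_records` are hypothetical, and the rows here are hard-coded stand-ins for whatever the Hugging Face reader returns.

```python
import json
import time
from typing import Any, Dict, Iterable, Iterator


def to_record_message(stream: str, row: Dict[str, Any], emitted_at_ms: int) -> Dict[str, Any]:
    """Wrap one row in a RECORD envelope (stream name, data, emit time in ms)."""
    return {
        "type": "RECORD",
        "record": {"stream": stream, "data": row, "emitted_at": emitted_at_ms},
    }


def emit_records(stream: str, rows: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Yield one JSON line per row, stamped with the current time."""
    for row in rows:
        now_ms = int(time.time() * 1000)
        # default=str keeps non-JSON types (dates, decimals) from raising.
        yield json.dumps(to_record_message(stream, row, now_ms), default=str)


if __name__ == "__main__":
    # Stand-in rows; a real source would read these from the dataset.
    sample_rows = [{"id": 1, "text": "hello"}, {"id": 2, "text": "world"}]
    for line in emit_records("train", sample_rows):
        print(line)
```

In the CDK, the framework handles this envelope for you, so the connector code mostly needs to yield the row dicts for each stream.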
Notes:
Cache and SQLProcessor
.31
Definition of Done