NVIDIA / NeMo-Curator

Scalable data pre-processing and curation toolkit for LLMs
Apache License 2.0
478 stars 57 forks

[Tutorials] Add a tutorial for PEFT data curation #45

Closed Maghoumi closed 4 months ago

Maghoumi commented 5 months ago

This PR adds a new tutorial to demonstrate data curation for PEFT use-cases.

Maghoumi commented 4 months ago

@ryantwolf marking as ready for review, but we're still facing the null bug.

ryantwolf commented 4 months ago

Null bug should be fixed in #55

Maghoumi commented 4 months ago

I rebased onto main (after #55 was merged) and I still see null entries. Do I need to modify something on my end to get this to work?

ryantwolf commented 4 months ago

@Maghoumi did you reinstall after rebase? I'm running on your branch right now and not seeing any nulls.

Maghoumi commented 4 months ago

Oops my bad, old habits! It works now

On Mon, May 6, 2024 at 8:10 PM Ryan Wolf @.***> wrote:

@Maghoumi did you reinstall after rebase? I'm running on your branch right now and not seeing any nulls.

arhamm1 commented 4 months ago

[like] Arham Mehta reacted to your message:


Maghoumi commented 4 months ago

@ryantwolf I addressed the comments that I could and pushed all the changes. Let me know if there's anything else needed before merging.