taylorai / galactic

data cleaning and curation for unstructured text
Apache License 2.0

Sweep: Add documentation for src/galactic/loaders.py to the README.md #1

Closed andersonbcdefg closed 1 year ago

andersonbcdefg commented 1 year ago

Currently, the "Loading Data" section of the README is incomplete. Update it with well-documented methods and examples.

Checklist

- [X] `README.md` ✅ Commit [`2040a39`](https://github.com/taylorai/galactic/commit/2040a390f9f59cc86ce21ead7395dc949bd54396)
  - Add a brief explanation of the purpose of the `src/galactic/loaders.py` file at the beginning of the "Loading Data" section.
  - Add a detailed explanation and example for the `from_csv` method in the "Loading Data" section. Explain what it does, the parameters it takes, and provide an example of how to use it.
  - Add a detailed explanation and example for the `from_jsonl` method in the "Loading Data" section. Explain what it does, the parameters it takes, and provide an example of how to use it.
  - Add a detailed explanation and example for the `from_pandas` method in the "Loading Data" section. Explain what it does, the parameters it takes, and provide an example of how to use it.
  - Add a detailed explanation and example for the `from_hugging_face` method in the "Loading Data" section. Explain what it does, the parameters it takes, and provide an example of how to use it.
  - Add a detailed explanation and example for the `from_hugging_face_stream` method in the "Loading Data" section. Explain what it does, the parameters it takes, and provide an example of how to use it.
  - Add a detailed explanation and example for the `from_disk` method in the "Loading Data" section. Explain what it does, the parameters it takes, and provide an example of how to use it.
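For reference, here is a minimal sketch of the shape a "what it does, parameters, example" entry could take for the CSV and JSONL loaders. It does not call galactic: `load_csv_records` and `load_jsonl_records` are hypothetical stand-ins written with the Python standard library, and the real `from_csv` and `from_jsonl` signatures and return types should be taken from `src/galactic/loaders.py`.

```python
import csv
import json
import tempfile
from pathlib import Path


def load_csv_records(path):
    """Hypothetical stand-in (not galactic's API): read a CSV file with a
    header row into a list of dicts, one dict per data row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def load_jsonl_records(path):
    """Hypothetical stand-in (not galactic's API): read a JSONL file, where
    each non-blank line is one JSON object, into a list of dicts."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


# Write two tiny sample files so the example is self-contained.
tmp_dir = Path(tempfile.mkdtemp())
csv_path = tmp_dir / "sample.csv"
jsonl_path = tmp_dir / "sample.jsonl"
csv_path.write_text("id,text\n1,hello\n2,world\n", encoding="utf-8")
jsonl_path.write_text(
    '{"id": 1, "text": "hello"}\n{"id": 2, "text": "world"}\n',
    encoding="utf-8",
)

csv_records = load_csv_records(csv_path)
jsonl_records = load_jsonl_records(jsonl_path)
```

One detail a README example like this should probably call out: in this sketch the CSV reader returns every value as a string, while the JSONL reader preserves JSON types such as integers. Whether galactic's loaders behave the same way is an assumption to verify against the source.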
Sandbox Execution Logs
trunk init 1/3 ✅
✔ Downloading Trunk 1.15.0... done
✔ Verifying Trunk sha256... done
✔ Unpacking Trunk... done

✔ 13 linters were enabled (.trunk/trunk.yaml)
  bandit 1.7.5 (9 python files)
  black 23.9.1 (3 jupyter, 9 python files)
  checkov 2.4.9 (2 yaml files)
  git-diff-check (22 files)
  isort 5.12.0 (9 python files) (created .isort.cfg)
  markdownlint 0.36.0 (1 markdown file) (created .markdownlint.yaml)
  osv-scanner 1.4.0 (1 lockfile file)
  prettier 3.0.3 (1 markdown, 2 yaml files)
  ruff 0.0.289 (9 python files) (created ruff.toml)
  taplo 0.8.1 (1 toml file)
  trivy 0.45.0 (1 lockfile, 2 yaml files)
  trufflehog 3.55.1 (22 files)
  yamllint 1.32.0 (2 yaml files) (created .yamllint.yaml)
Next Steps
 1. Read documentation
    Our documentation can be found at https://docs.trunk.io
 2. Get help and give feedback
    Join the Trunk community at https://slack.trunk.io
trunk fmt README.md 2/3 ✅

 ✔ Formatted README.md
Re-checking autofixed files...

Checked 1 file
✔ No issues
trunk check --fix README.md 3/3 ✅

Checked 1 file
✔ No issues
sweep-ai[bot] commented 1 year ago

Here's the PR! https://github.com/taylorai/galactic/pull/2.



Step 1: 📍 Planning

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Some code snippets I looked at. If a file is missing from this list, you can mention its path in the ticket description.

- https://github.com/taylorai/galactic/blob/1972d199a8d78c5c2f923388eb660da9ed044db7/src/galactic/loaders.py#L1-L81
- https://github.com/taylorai/galactic/blob/1972d199a8d78c5c2f923388eb660da9ed044db7/README.md#L1-L49

Step 2: ⌨️ Coding

trunk init 1/3 ✅
trunk fmt README.md 2/3 ✅
trunk check --fix README.md 3/3 ✅


Step 3: 🔁 Code Review

I have finished reviewing the code for completeness. I did not find any errors in `sweep/add-documentation-loaders`.


💡 To recreate the pull request, edit the issue title or description. To tweak the pull request, leave a comment on the pull request.