Open Bytes-Explorer opened 2 months ago
@Bytes-Explorer I would like to work on this card. It should take the usual two weeks.
Sounds good!
@SowmyaLR Please don't hesitate to reach out if you run into any issues related to the framework as you build this transform. How familiar are you with the PDF2Parquet transform? It creates a row in a Parquet file from the Markdown sections of the document.
Hi @touma-I, I have done some research on grammar correction and plan to start on this transform this week. Thank you for the information about PDF2Parquet. I need to look into it and will get back to you with any further questions.
Hi @Bytes-Explorer @touma-I I have a few questions about this task
@Bytes-Explorer @touma-I any updates on the above questions?
Sorry missed this @SowmyaLR
Component
Transforms/Other
Feature
Implement a new feature to detect and correct grammar, punctuation, or spelling errors in a given text. This functionality should work on every row of a Parquet file, where every row contains one document. The output should add three columns per row: a True/False flag indicating whether errors were detected, the spans of the text where errors were detected, and the corrected text.
This can be added as a new transform for text/NLP data. One can refer to the code quality module as a reference for how filters have been applied to code data.
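To make the requested output shape concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the column names `contents`, `has_errors`, `error_spans`, and `corrected_text` are hypothetical, rows are modeled as a list of dicts rather than an actual Parquet table, and the tiny misspelling lookup stands in for whatever grammar or spell checker the real transform would use.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy lookup standing in for a real grammar/spelling checker.
TOY_CORRECTIONS = {"teh": "the", "recieve": "receive", "definately": "definitely"}

_WORD = re.compile(r"[A-Za-z]+")


def check_document(text: str) -> Dict[str, object]:
    """Return the flag, error spans, and corrected text for one document."""
    spans: List[Tuple[int, int]] = []
    pieces: List[str] = []
    last = 0
    for match in _WORD.finditer(text):
        fix = TOY_CORRECTIONS.get(match.group().lower())
        if fix is None:
            continue
        # Preserve a leading capital letter from the original word.
        if match.group()[0].isupper():
            fix = fix.capitalize()
        # Spans are (start, end) character offsets into the original text.
        spans.append((match.start(), match.end()))
        pieces.append(text[last:match.start()])
        pieces.append(fix)
        last = match.end()
    pieces.append(text[last:])
    return {
        "has_errors": bool(spans),
        "error_spans": spans,
        "corrected_text": "".join(pieces),
    }


def annotate_rows(rows: List[dict], text_column: str = "contents") -> List[dict]:
    """Add the three output columns to every row (one document per row)."""
    return [{**row, **check_document(row[text_column])} for row in rows]


rows = [{"contents": "I will recieve teh file."}, {"contents": "All good here."}]
for row in annotate_rows(rows):
    print(row["has_errors"], row["error_spans"], row["corrected_text"])
```

In an actual transform, the same per-row function would presumably be applied to the text column of each Parquet table, with the three new columns appended to the output table.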