anakib1 / MangoTruth

Open source infrastructure for AI plagiarism detection

Create dataset #16

Open Silence-o0 opened 4 weeks ago

Silence-o0 commented 4 weeks ago

Collect a set of PDF or DOCX files suitable for the project task, making sure the content types vary.
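
One way to check that variation could be a small manifest script. This is only a sketch under assumptions that are not stated in this issue: the collected files sit under a single root folder, and each top-level subfolder (for example `essays`) names a content type. The function name `summarize_collection` is hypothetical.

```python
from collections import Counter
from pathlib import Path

# Assumption: only these two formats count toward the dataset.
ALLOWED_SUFFIXES = {".pdf", ".docx"}


def summarize_collection(root):
    """Count PDF/DOCX files per (content type, extension) under root.

    Assumption: the first subfolder below root names the content type.
    """
    root = Path(root)
    counts = Counter()
    for path in sorted(root.rglob("*")):
        suffix = path.suffix.lower()
        if not path.is_file() or suffix not in ALLOWED_SUFFIXES:
            continue
        relative = path.relative_to(root)
        # Files directly under root have no subfolder to use as a label.
        if len(relative.parts) > 1:
            content_type = relative.parts[0]
        else:
            content_type = "uncategorized"
        counts[(content_type, suffix)] += 1
    return counts
```

Calling `summarize_collection("data")` on a hypothetical `data` folder returns a `Counter` keyed by `(content_type, extension)`, which makes an underrepresented content type or format easy to spot.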

anakib1 commented 1 week ago

Partially done. Collected so far:

https://huggingface.co/datasets/anakib1/mango-truth

anakib1 commented 1 week ago

I will also look for other data, possibly the data @AntonGog171 suggested.

(@AntonGog171 please link your data here)