chaoyi-wu / PMC-LLaMA

The official codes for "PMC-LLaMA: Towards Building Open-source Language Models for Medicine"

Content filtering of books #20

Open maldivesxue opened 8 months ago

maldivesxue commented 8 months ago

Thank you for your excellent work, which has inspired us greatly. I am curious about the de-duplication and content filtering of the books. Could you tell me which library you used, or whether you plan to open-source the corresponding code? Thanks in advance.

chaoyi-wu commented 7 months ago

Hello, the books we used are listed here. Because of the license, I cannot share the exact contents with you, but you may collect them online.