Nougat weights are CC-BY-NC

staghado commented 1 day ago

I don't see how the weights can be Apache-2.0 when some of the data used to train the model is CC-BY-NC (the Nougat subset, for example). Thanks for the clarification.

Ucas-HaoranWei commented 1 day ago

We do not use the Nougat weights or data.
Is Nougat's data open-source? To my knowledge, they have not released it, so it would be impossible for us to use it.
The data we use for training is completely different from Nougat's, and we process the Mathpix format.
Nougat only inspired us to use arXiv's LaTeX format to process data and create our own training data for GOT.

staghado commented 23 hours ago

Thanks for the clarification. I was confused and thought Nougat was used for the annotation. It's clearer now. Great work!

Ucas-HaoranWei / GOT-OCR2.0

Nougat weights are CC-BY-NC #158