facebookresearch / nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents
https://facebookresearch.github.io/nougat/
MIT License

Maintenance status; community fork #242

Open SichangHe opened 2 months ago

SichangHe commented 2 months ago

It seems that Meta is no longer interested in this research project and that it is not maintained; e.g., bugs like #232 that render the project unusable are not fixed.

If so, we need to fork this project and maintain it as a community. This is the best PDF-to-Markdown converter I know of; it is much easier to maintain a community fork than to write a new one from scratch.

Thoughts please, @breisfeld, @YuCheng-Qi, @williamparson, @YodaGitMaster, @Halle-Astra, @she3o, @MaksimMrvica-plus, @hongyi-zhao, @sparsh35, @ivanmladek, @S1M0N38, @itsaphel, @bryanyzhu, @luckymore, @fqassemi, @robertobagnato, @Vidminas? (Randomly pinging some people who seem to have been active.)

ivanmladek commented 2 months ago

I'm down. We found Nougat to be the best PDF exporter, and it would be a shame to let it die.

S1M0N38 commented 2 months ago

I've used Nougat and it worked quite well for my purpose. However, I'm not planning to use it in the foreseeable future, and at the moment I do not have much spare time to invest in it.

SichangHe commented 2 months ago

(Mentioning some people who forked.) @sidharthrajaram, @tracyqwerty, @stanlitoai, @nhorlava, @Sparklizm, @taxom-techlead, @mtfranzen, @perryzjc, @cannin, @JTran-IDM, @KPHippe, @sairin94, @arielweinberger, @mambisi, @JasonKitty.

SichangHe commented 2 months ago

@lukas-blecher, do we need to use another name for legal reasons?

ivanmladek commented 2 months ago

https://github.com/tracyqwerty/nougat has already forked the project and fixed most of the bugs. It might be good to continue there. cc @tracyqwerty

SichangHe commented 2 months ago

> https://github.com/tracyqwerty/nougat has already forked the project and fixed most of the bugs. It might be good to continue there. cc @tracyqwerty

Have you tried it? The changes do not look like they would fix bugs to me.

Also, we need actual experienced and careful people to maintain this; otherwise it would quickly become a giant mess.

SichangHe commented 1 month ago

> @lukas-blecher, do we need to use another name for legal reasons?

I am going to call it nutted if there are no better suggestions.

fqassemi commented 1 month ago

A lot can be improved, in both performance and quality, with more diverse data.