opendatalab / magic-doc

Apache License 2.0
296 stars 22 forks

How do I use this project to extract web pages? #17

Open rangehow opened 1 month ago

icecraft commented 1 month ago

try https://github.com/opendatalab/magic-html

XiyueSun commented 1 month ago

try https://github.com/opendatalab/magic-html

Hello, does this repo implement the web page extraction described in Magic-Doc?

icecraft commented 1 month ago

try https://github.com/opendatalab/magic-html

Hello, does this repo implement the web page extraction described in Magic-Doc?

Only PPT/PPTX/DOC/DOCX/PDF are supported for now.

drunkpig commented 1 month ago

try https://github.com/opendatalab/magic-html

Hello, does this repo implement the web page extraction described in Magic-Doc?

yes, you're right, magic-html is used to extract mixed web pages
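
For anyone landing here, a minimal sketch of what calling magic-html from Python might look like. It assumes the package exposes a `GeneralExtractor` class with an `extract(html, base_url=...)` method, as its README suggested at the time of writing; check the magic-html repo for the current API. The wrapper name `extract_main_content` and its `extractor` parameter are hypothetical, added only so the extractor can be swapped out.

```python
from typing import Any, Optional


def extract_main_content(html: str, url: str, extractor: Optional[Any] = None) -> Any:
    """Pass raw HTML to an extractor and return whatever it produces.

    `extractor` is any object with an `extract(html, base_url=...)` method.
    When omitted, a magic-html extractor is created lazily.
    """
    if extractor is None:
        # Assumption: magic-html is installed and provides GeneralExtractor
        # with this interface. Verify against the project's README.
        from magic_html import GeneralExtractor

        extractor = GeneralExtractor()
    return extractor.extract(html, base_url=url)
```

With magic-html installed, `extract_main_content(html, "http://example.com/")` would return the library's result for that page. The shape of that result is not described in this thread, so inspect it before relying on specific fields.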