mozilla / fathom

A framework for extracting meaning from web pages
http://mozilla.github.io/fathom/
Mozilla Public License 2.0

Fix up CLI and FathomFox dependencies to make `fathom train` run again #329

Open linabutler opened 1 year ago

linabutler commented 1 year ago

Hi! 👋🏼 I'm doing some prototyping with Fathom, and ran into a few dependency-related snags getting `fathom train` to work. This PR is an attempt to fix all of them up. With this patch stack, `fathom train` works for me, and prints metrics and results! 🎉

I haven't worked with (or on) Fathom before, but happy to revert or fix up any commits. The individual commit messages have some more details about the versions I chose, but here's a quick summary:

I also had to use Python 3.9.13. It looks like 3.11 is too new for the version of PyTorch that Fathom depends on—but I wasn't sure about bumping that dependency just yet.
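To make the interpreter constraint above easier to spot before running anything, here is a minimal sketch of a version guard. It is not part of Fathom or this patch stack. The function name `classify_python` and the three labels are hypothetical, and the boundaries come only from what is reported here: 3.9.13 worked, 3.11 looked too new, and 3.10 was not tried, so it is reported as untested rather than guessed at.

```python
import sys

# Assumptions taken from this PR description only:
#   - Python 3.9.x is known to work (3.9.13 was used).
#   - Python 3.11 appears too new for the pinned PyTorch version.
#   - Anything else (for example 3.10) was not tried, so it is "untested".
KNOWN_GOOD = (3, 9)
TOO_NEW = (3, 11)


def classify_python(version=None):
    """Return 'known-good', 'too-new', or 'untested' for a Python version.

    `version` is any tuple starting with (major, minor); it defaults to
    the running interpreter's version.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    if major_minor >= TOO_NEW:
        return "too-new"
    if major_minor == KNOWN_GOOD:
        return "known-good"
    return "untested"


if __name__ == "__main__":
    # Report on the interpreter running this script.
    print(classify_python())
```

Running the script with the interpreter you plan to use prints one of the three labels; only `known-good` reflects a setup that was actually exercised here.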

Thanks!

/cc @gleonard-m @DimiDL