gap-packages / FactInt

FactInt -- Advanced Methods for [Fact]oring [Int]egers
https://gap-packages.github.io/FactInt/
GNU General Public License v2.0
4 stars 4 forks source link

Update the factor databases #14

Open Stefan-Kohl opened 6 years ago

Stefan-Kohl commented 6 years ago

The factor databases shipped with FactInt have not been updated since years. For example, the database of factors of integers of the form b^k +/- 1 has been updated for the last time in June 2011. Depending on how much people need additional factorizations of such numbers which have been found since then in the corresponding factorization projects, one could update the databases again. Of course one needs to take into account the additional storage consumption when storing more factors.

By the way, the URL of the database of factors of integers of the form b^k +/- 1 has changed since its mentioning in lib/general.gi. It is now http://myfactors.mooo.com/ .

frankluebeck commented 3 years ago

A slightly improved FetchBrentFactors and an update of the data is provided in the pull request "Update Brent factors" (#23).

This data collection has grown considerably since its last update in FactInt. In my applications I only need the factors of p^k - 1 where p is a prime. But the other factors may be useful in other applications. The source of the data is growing and updated by Jonathan Crombie every few months. I'm not sure if storage space is a concern nowadays.

olexandr-konovalov commented 3 years ago

Oh that's nice @frankluebeck, thanks!

olexandr-konovalov commented 3 years ago

One can possibly make .gz of all files in tables/brent to address the problem of growing space (CC @fingolfin)

frankluebeck commented 3 years ago

Yes, gzipping all files in that directory shrinks it from 350 MB to 130 MB. But this does not shrink the distribution archive. Is the distribution size really a problem today? The applications I'm aware of are finding multiplicative orders of finite field elements and, more generally, matrices over finite fields. For this it would be sufficient to have the Brent data for prime bases (numbers p^k-1, p prime). These data need 46 MB (17 MB gzipped).

One could download and cache the files only when they are needed, similar to data in AtlasRep and maybe provide a utility function to download everything. But that is more work. (This approach could be useful for other packages as well).

olexandr-konovalov commented 2 years ago

If we want to make a new release incorporating PR #23, we should be ready to the 20-fold increase of the package distribution size. FactInt-master.zip is now 116 Mb, and tar.gz will be smaller, but still a lot. To compare, FactInt-1.6.3.zip was 5 Mb.

fingolfin commented 2 years ago

Yeah this growth is really unfortunate. We just spent a lot of effort together with the simpcomp maintainer to slim down their package, and this is all eaten up (and more) by the growth of FactInt, for a change that I think is not that useful for most GAP users (no offense intended! I am sure it's very useful for something, just not for "most").

Some other package download such data only on demand, perhaps FactInt could do something like this?

And of course we may need the whole concept of our package distribution...

hulpke commented 2 years ago

The same issue is avoided in transgrp by providing groups of degree 32 and 48 only as optional download. I think the same would be sensible here. Storage space still is an issue for users in a a managed (ie.e. a sysadmin who needs to approve) setting.

frankluebeck commented 2 years ago

Above I have made one proposal to reduce the distribution size (to about 13% of the discussed growth) by only including factors where the base is a prime.

Another possibility: do not distribute any new factors, and instead add and document a utility function like

UpdateBrentFactors := function()
  FetchBrentFactors(
       "https://maths-people.anu.edu.au/~brent/ftp/factors/factors.gz",
       false);
  FetchBrentFactors(
     "http://myfactorcollection.mooo.com:8090/brentdata/Aug4_2021/factors.gz",
     true);
end;

(the date part of the latter call must be updated from time to time) This function must be called by a user with write permissions in the directory <FactIntDir>/tables/brent/, it only works on UNIX, and needs the external program curl.

PS: I think "is not that useful for most GAP users" is not a valid criterion, as long as "most GAP users" does not mean "all GAP users", because this holds for almost everthing we distribute with GAP.

olexandr-konovalov commented 2 years ago

Ok, so I am delaying the release until this will be resolved, in one way or another.

olexandr-konovalov commented 1 year ago

It would be nice to tackle the issue of the large increase of the FactInt download. I would not like to make a release until then.

frankluebeck commented 1 year ago

A decision is needed about a default set of factors distributed with the package. This could be just a set distributed with former versions of FactInt. Conceptually nicer: define and document a set of bases x and maximal exponents k such that all known factors of x^i - 1 for i <= k are included in the distribution. Of course, this would need to be updated from time to time.

The database of factors can be extended to all known factors of numbers of this kind by the following function:

LoadPackage("StandardFF");
UpdateBrentFactors := function()
  FetchMoreFactors(
       "https://maths-people.anu.edu.au/~brent/ftp/factors/factors.gz",
       false);
  FetchMoreFactors(
     "http://myfactorcollection.mooo.com:8090/brentdata/Dec31_2022/factors.gz",
     true);
end;

Instead of FetchMoreFactors from the StandardFF package one could use the improved FetchBrentFactors from b1621ab.

olexandr-konovalov commented 1 year ago

I would go for an additional set to download from FactInt page and instructions how to install it. Downloading from URLs of others where we don't have any control is not sustainable in the long run.