Macaulay2 / M2

The primary source code repository for Macaulay2, a system for computing in commutative algebra, algebraic geometry and related fields.
https://macaulay2.com

Duplicate documentation files #1144

Open d-torrance opened 4 years ago

d-torrance commented 4 years ago

There is about 1 MB of duplicate documentation files after Macaulay2 is installed:

profzoom@gloria:/usr/share/doc/Macaulay2$ jdupes -rm .
Scanning: 13448 files, 580 items (in 1 specified)
90 duplicate files (in 84 sets), occupying 1 MB

(See https://gist.github.com/d-torrance/5e016cfe5e40096efceb1f6034d08535 for a more verbose report.)

This can be fixed by running jdupes -rl on the documentation directory after all the examples are generated. But of course this would require adding jdupes to the list of dependencies for building Macaulay2. Is there interest in doing this?
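
For anyone weighing the extra dependency, here is a rough sketch of what a dependency-free version of the deduplication step might look like. It is not how jdupes works internally. The function names are hypothetical, and the choice to use hard links (rather than symlinks or anything else) is an assumption made only for illustration. Files are grouped by size and SHA-256 digest, and every duplicate after the first is swapped for a hard link to the first.

```python
import hashlib
import os
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(root: Path) -> int:
    """Replace duplicate regular files under root with hard links.

    Hypothetical helper, not part of Macaulay2 or jdupes.
    Returns the number of files that were relinked.
    """
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            p = Path(dirpath) / name
            # Skip symlinks and anything that is not a regular file.
            if p.is_file() and not p.is_symlink():
                groups[(p.stat().st_size, file_digest(p))].append(p)

    relinked = 0
    for paths in groups.values():
        keep = paths[0]
        for dup in paths[1:]:
            if os.path.samefile(keep, dup):
                continue  # already the same inode, nothing to do
            tmp = dup.with_name(dup.name + ".linktmp")
            os.link(keep, tmp)    # create the new link under a temporary name
            os.replace(tmp, dup)  # then swap it in over the duplicate
            relinked += 1
    return relinked
```

Something like `hardlink_duplicates(Path("/usr/share/doc/Macaulay2"))` would be the call after the examples are generated. Hard links only work within one filesystem, and packaging tools may or may not preserve them, so this would need checking per distribution.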

DanGrayson commented 4 years ago

Another possibility is 'dupmerge':

einsteinium$ echo hi there > a/c/e
einsteinium$ echo hi there > a/b/d 
einsteinium$ ls -liR a
a:
total 0
8643877340 drwxrwxr-x 3 dan wheel 96 May 13 11:11 b
8643877341 drwxrwxr-x 3 dan wheel 96 May 13 11:12 c

a/b:
total 4
8643877349 -rw-rw-r-- 1 dan wheel 9 May 13 11:15 d

a/c:
total 4
8643877374 -rw-rw-r-- 1 dan wheel 9 May 13 11:15 e
einsteinium$ find a -type f -print0 | dupmerge
dupmerge started at 2020-05-13 11:15:57
tmpfile (pointer 0x7fffa2808030) created, processing ...
Input: 2 files, processing ...
ln  a/c/e a/b/d: 1, 1 -> 2, freed +8 blocks
Scanning for more dups ...
Files linked: 1 of 2, Disk blocks reclaimed: 8
Minimum of found hard links: 1, Maximum: 2.
einsteinium$ ls -liR a
a:
total 0
8643877340 drwxrwxr-x 3 dan wheel 96 May 13 11:15 b
8643877341 drwxrwxr-x 3 dan wheel 96 May 13 11:12 c

a/b:
total 4
8643877374 -rw-rw-r-- 2 dan wheel 9 May 13 11:15 d

a/c:
total 4
8643877374 -rw-rw-r-- 2 dan wheel 9 May 13 11:15 e
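
In the second listing both a/b/d and a/c/e show inode 8643877374 with a link count of 2, which is what confirms the merge. The same check can be scripted. This is a minimal sketch, and the helper name `same_inode` is made up for illustration:

```python
import os


def same_inode(a: str, b: str) -> bool:
    """Return True if both paths refer to the same file on the same device."""
    sa, sb = os.stat(a), os.stat(b)
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)
```

Calling `same_inode("a/b/d", "a/c/e")` would be expected to return True after the dupmerge run above and False before it. The standard library's `os.path.samefile` performs the same comparison.
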
DanGrayson commented 4 years ago

Brew has both jdupes and fdupes but not dupmerge:

einsteinium$ brew search jdupes
==> Formulae
jdupes
einsteinium$ brew search fdupes
==> Formulae
fdupes
einsteinium$ brew search dupmerge
No formula or cask found for "dupmerge".

DanGrayson commented 4 years ago

Arch Linux has only fdupes:

arch$ pacman --sync --search fdupes
community/fdupes 1:2.0.0-1
    a program for identifying or deleting duplicate files residing within specified
    directories
community/rmlint 2.9.0-2
    Tool to remove duplicates and other lint, being much faster than fdupes
arch$ pacman --sync --search jdupes
arch$ pacman --sync --search dupmerge
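
Since availability differs between package managers, one option would be to treat the deduplication tool as optional and use whichever one is found. The following is only a sketch of that idea; the helper name and the candidate list are assumptions taken from the tools mentioned in this thread, and nothing like this exists in the Macaulay2 build.

```python
import shutil

# Tool names mentioned in this thread; the order is an arbitrary assumption.
CANDIDATES = ("jdupes", "fdupes", "dupmerge", "rmlint")


def available_tools(candidates=CANDIDATES, which=shutil.which):
    """Return the candidate tool names that are found on PATH."""
    return [name for name in candidates if which(name) is not None]
```

A build step could call `available_tools()` and skip deduplication when the list is empty, rather than failing. Each tool takes different command-line options, so the actual invocation would still need to be written per tool.
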
mahrud commented 3 years ago

Probably the proper fix is to store the examples in the database files and not distribute any of these output files. What's the point of splitting the output every time help is called?