googlefonts / fontbakery-dashboard

A library-scale web dashboard for Font Bakery, no longer developed
Apache License 2.0

[googlefonts/upstream/spreadsheet] Duplicate rows etc. #96

Open graphicore opened 6 years ago

graphicore commented 6 years ago

I have some rather open remarks and a request for comments. I want to understand the data in the Google Fonts Upstream Repos spreadsheet better in order to use it correctly.

We have some duplicate rows in that spreadsheet:

Also, the first sheet in Google Fonts Upstream Repos says that only the statuses OK and NOTE are "enabled on the dashboard". Does this also apply to non-dashboard fontbakery?

If we only look at rows with status OK and NOTE, the duplicates are:

* Cutive Mono
```js
[ 'NOTE',
  'Cutive Mono',
  'Passed',
  'https://github.com/googlefonts/CutiveFont.git',
  'fonts/CutiveMono-',
  'sfd=5 | glyphs=1 | ufo=3',
  'CutiveMono.glyphs=None',
  'Monospace',
  1209375,
  42000,
  'There\'s also an old Cultive Roman on the upstream repo at: https://github.com/vernnobile/CutiveFont/tree/master/CutiveRoman',
  ''
]
```

* Libre Caslon Text
```js
[ 'OK',
  'Libre Caslon Text',
  'Passed',
  'https://github.com/impallari/Libre-Caslon-Text',
  '/fonts/TTF/LibreCaslonText-',
  'vfb=9',
  '',
  '',
  '',
  '',
  '',
  ''
]

[ 'OK',
  'Libre Caslon Text',
  'Passed',
  'https://github.com/impallari/Libre-Caslon-Display',
  '/fonts/TTF/LibreCaslonDisplay-',
  'vfb=3',
  '',
  '',
  '',
  '',
  '',
  ''
]
```

* Aleo
```js
[ 'OK',
  'Aleo',
  'Passed',
  'https://github.com/AlessioLaiso/aleo',
  'fonts/ttf/Aleo-',
  'glyphs=8 | ufo=6',
  'Aleo-Italic.glyphs=weight | Aleo-Bold.glyphs=None | Aleo-Light.glyphs=None | Aleo-BoldItalic.glyphs=None | Aleo.glyphs=weight',
  '',
  '',
  '',
  '',
  ''
]
```

There are also interesting statuses, maybe typos:

* `NOE`,  `Over the Rainbow`
* `NOTe`, `Slabo 13px`

If these are meant to be `NOTE`, they should be fixed.

Maybe off-topic (since this should only look at `OK` and `NOTE`): for rows with a status of `RENAMED` there seems to be no information about the old name, e.g.:

```js
[ 'RENAMED',
  'Mr Bedfort',
  'Passed',
  '',
  '',
  '',
  '',
  'Handwriting',
  39997,
  29000,
  'This project was renamed Mr Bedfort',
  ''
]
```

m4rc1e commented 6 years ago

This doc is a combo of human input and some scripting. We had two lovely people help us track down the upstream repositories; this was an arduous task, so I thank them for it.

I'll do a quick tidy-up now.

davelab6 commented 6 years ago

> 'Libre Caslon Text',
> 'Passed',
> 'https://github.com/impallari/Libre-Caslon-Display',

This is a bug; the repo URL shows that the family name is wrong, s/Text/Display

> 'Cutive Mono',
> 'Passed',
> 'https://bitbucket.org/lassefister/old-googlefontdirectory/src/21142f3bf7ad39d89c1c682d30830494ef1c905c/ofl/cutivemono/?at=default',

This is an obsolete source location and can be discarded

Aleo looks like an identical duplicate, so drop one copy; the same goes for the others Lasse mentioned but didn't go into detail about

m4rc1e commented 6 years ago

Alright, I've fixed the issues you've raised. @davelab6 One of the duplicates for Libre Caslon I've renamed to the Display family :-)

Just as a side note, we had two lovely people help us track down and find a lot of upstream repos for us. This was quite an undertaking so we should be glad they invested the time to do this.

graphicore commented 6 years ago

Should we add some mild self-checks when reading the data?
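
For illustration, a mild self-check of this kind could look roughly like the sketch below. It is not code from the dashboard. It assumes rows arrive as arrays shaped like the ones quoted earlier in this issue (status first, family name second); `checkRows`, `KNOWN_STATUSES` and `ENABLED_STATUSES` are hypothetical names, and the status list only contains values mentioned in this thread.

```js
// Hypothetical sketch: the known statuses are only those mentioned in this issue.
const KNOWN_STATUSES = new Set(['OK', 'NOTE', 'RENAMED']);
// Only these are "enabled on the dashboard" according to the first sheet.
const ENABLED_STATUSES = new Set(['OK', 'NOTE']);

// Returns a list of human-readable problems instead of throwing,
// so all issues in the sheet can be reported at once.
function checkRows(rows) {
  const problems = [];
  const seen = new Map(); // family name -> index of the first enabled row
  rows.forEach(([status, family], index) => {
    if (!KNOWN_STATUSES.has(status)) {
      problems.push(`row ${index}: unknown status "${status}" for "${family}"`);
      return;
    }
    if (!ENABLED_STATUSES.has(status)) return; // e.g. RENAMED rows are skipped
    if (seen.has(family)) {
      problems.push(`row ${index}: duplicate of row ${seen.get(family)} for "${family}"`);
    } else {
      seen.set(family, index);
    }
  });
  return problems;
}

const problems = checkRows([
  ['OK', 'Aleo'],
  ['OK', 'Aleo'],
  ['NOTe', 'Slabo 13px'],
  ['RENAMED', 'Mr Bedfort'],
]);
console.log(problems);
```

Collecting problems rather than stopping at the first one is a design choice for this sketch; a stricter reader could throw as soon as `problems` is non-empty.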

Should we use something shorter for 'family name is confirmed as good?'? I can work with it, but it's very verbose ;-)

graphicore commented 6 years ago

FYI: about `family name is confirmed as good?`: that row is going to be used as a replacement for:

com.google.fonts/check/165 Familyname must be unique according to namecheck.fontdata.com

m4rc1e commented 6 years ago

@graphicore When I check the doc history, this column was added by @davelab6.

graphicore commented 6 years ago

I can confirm that the duplicates for OK and NOTE are gone and the status typos NOTe and NOE are gone as well.

> When I check the doc history …

I was sitting next to him ;-) See my FYI comment above.

m4rc1e commented 6 years ago

ahaha ok, I thought you were assuming that's what its usage was going to be :-).

Let me know if you need anything else.

graphicore commented 6 years ago

Here's more:

I'm looking only at git upstreams with status OK or NOTE now.

Generally, for upstream, if it is a git repository I'd like to see the URL end with `.git`. For https://github.com/... I can infer this though.

Further, if the branch is not master we could write it like https://github.com/zeynepakay/Rakkas.git:gh-pages or alternatively add a further column git-branch which defaults to "master".
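
To make the first notation concrete, here is a minimal sketch of how a value such as `https://github.com/zeynepakay/Rakkas.git:gh-pages` could be split. The `.git:branch` form is only a proposal in this thread, and `parseUpstream` is a hypothetical helper name, not an existing function.

```js
// Hypothetical helper for the proposed "<url>.git:<branch>" notation.
// Branch names containing "/" or ":" are not handled by this sketch.
function parseUpstream(value) {
  const match = /^(.+\.git):([^:/]+)$/.exec(value);
  let url = match ? match[1] : value;
  const branch = match ? match[2] : 'master'; // default as suggested above
  // For GitHub URLs the ".git" suffix can be inferred.
  if (url.startsWith('https://github.com/') && !url.endsWith('.git')) {
    url += '.git';
  }
  return { url, branch };
}

console.log(parseUpstream('https://github.com/zeynepakay/Rakkas.git:gh-pages'));
console.log(parseUpstream('https://github.com/googlefonts/bangers'));
```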

Concrete problems at the moment:

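Bad upstream url:

* Antonio typo https://github.com/vernnobile/antonioFont/ should be https://github.com/vernnobile/antonioFont
* Bangers unclear (branch?) https://github.com/googlefonts/bangers can't fetch the branch "master", maybe internal to my code.
* Elsie missing https://github.com/googlefonts/elsiefont
* Esteban typo https://github.com/blipvert/googlefontdirectory-hg/tree/master/ofl/esteban should be ?
* Hip Hop Display missing https://github.com/googlefonts/HipHopFont
* Hip Hop Text missing https://github.com/googlefonts/HipHopFont
* Literata missing https://github.com/googlefonts/LiterataFont
* Rakkas branch https://github.com/zeynepakay/Rakkas should be https://github.com/zeynepakay/Rakkas.git:gh-pages
* Shanti typo https://github.com/vernnobile/ShantiFont/tree/master/FINAL should be https://github.com/vernnobile/ShantiFont and fontfiles prefix should maybe be: "FINAL/shanti-"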

SSL error

I'm not sure about these. Maybe it's GitHub punishing me for doing many unauthenticated requests, investigating:

Looks like:

```
Error: SSL error: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init
    at Error (native) errno: -20
```

For documentation purposes, this is the stderr output that I used to make this report:

```
ERROR failed _fetchRef remoteUrl: https://github.com/vernnobile/antonioFont/.git remoteName: upstream/Antonio referenceName: master { Error: unexpected HTTP status code: 404
    at Error (native) errno: -1 }
ERROR failed _fetchRef remoteUrl: https://github.com/googlefonts/elsiefont.git remoteName: upstream/Elsie referenceName: master { Error: authentication required but no callback set
    at Error (native) errno: -1 }
ERROR failed _fetchRef remoteUrl: https://github.com/blipvert/googlefontdirectory-hg/tree/master/ofl/esteban.git remoteName: upstream/Esteban referenceName: master { Error: unexpected HTTP status code: 404
    at Error (native) errno: -1 }
ERROR failed _fetchRef remoteUrl: https://github.com/googlefonts/HipHopFont.git remoteName: upstream/Hip_Hop_Text referenceName: master { Error: authentication required but no callback set
    at Error (native) errno: -1 }
ERROR failed _fetchRef remoteUrl: https://github.com/googlefonts/LiterataFont.git remoteName: upstream/Literata referenceName: master { Error: authentication required but no callback set
    at Error (native) errno: -1 }
ERROR failed _fetchRef remoteUrl: https://github.com/vernnobile/ShantiFont/tree/master/FINAL.git remoteName: upstream/Shanti referenceName: master { Error: unexpected HTTP status code: 404
    at Error (native) errno: -1 }
ERROR upstream: FAILED: Fetching remote (branch)"upstream/Bangers:master" at url: https://github.com/googlefonts/bangers.git
ERROR failed _fetchRef remoteUrl: https://github.com/googlefonts/bangers.git remoteName: upstream/Bangers referenceName: master { Error: no reference found for shorthand 'upstream/Bangers/master'
    at Error (native) errno: -3 }
ERROR upstream: FAILED: Fetching remote (branch)"upstream/Rakkas:master" at url: https://github.com/zeynepakay/Rakkas.git
ERROR failed _fetchRef remoteUrl: https://github.com/zeynepakay/Rakkas.git remoteName: upstream/Rakkas referenceName: master { Error: no reference found for shorthand 'upstream/Rakkas/master'
    at Error (native) errno: -3 }
ERROR failed _fetchRef remoteUrl: https://github.com/googlefonts/HipHopFont.git remoteName: upstream/Hip_Hop_Display referenceName: master { Error: authentication required but no callback set
    at Error (native) errno: -1 }
ERROR failed _fetchRef remoteUrl: https://github.com/CatharsisFonts/Cormorant.git remoteName: upstream/Cormorant referenceName: master { Error: SSL error: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init
    at Error (native) errno: -20 }
ERROR failed _fetchRef remoteUrl: https://github.com/clauseggers/Inknut-Antiqua.git remoteName: upstream/Inknut_Antiqua referenceName: master { Error: SSL error: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init
    at Error (native) errno: -20 }
ERROR failed _fetchRef remoteUrl: https://github.com/m4rc1e/PatrickHandSC.git remoteName: upstream/Patrick_Hand_SC referenceName: master { Error: SSL error: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init
    at Error (native) errno: -20 }
ERROR failed _fetchRef remoteUrl: https://github.com/theleagueof/prociono.git remoteName: upstream/Prociono referenceName: master { Error: SSL error: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init
    at Error (native) errno: -20 }
ERROR failed _fetchRef remoteUrl: https://github.com/Omnibus-Type/Sansita.git remoteName: upstream/Sansita referenceName: master { Error: SSL error: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init
    at Error (native) errno: -20 }
ERROR failed _fetchRef remoteUrl: https://github.com/Omnibus-Type/PragatiNarrow.git remoteName: upstream/Pragati_Narrow referenceName: master { Error: SSL error: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init
    at Error (native) errno: -20 }
ERROR failed _fetchRef remoteUrl: https://github.com/googlefonts/rubik.git remoteName: upstream/Rubik referenceName: master { Error: SSL error: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init
    at Error (native) errno: -20 }
ERROR failed _fetchRef remoteUrl: https://github.com/googlefonts/MuktaGFVersion.git remoteName: upstream/Ek-Mukta_MuktaVaani_Gujarati referenceName: master { Error: SSL error: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init
    at Error (native) errno: -20 }
ERROR failed _fetchRef remoteUrl: https://github.com/googlefonts/MooniakFontsGFVersion.git remoteName: upstream/PostNoBills_Colombo referenceName: master { Error: SSL error: error:140E0197:SSL routines:SSL_shutdown:shutdown while in init
    at Error (native) errno: -20 }
```

graphicore commented 6 years ago

Don't mind the SSL error, it's my problem.

It seems like my suspicion was correct. I was able to fetch all these repos with my code, when not trying to fetch everything else. Also I found a comment (kytrinyx, GitHub Staff):

> There are no hard rate limits on cloning, so you are free to clone as much as you’d like. Still, we’d like to ask you to clone at a reasonable pace. Cloning a few (2-3-4) repositories in parallel is okay, cloning a 100 repositories in parallel is not and can be detected as abusive behavior by our automated measures.
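
One way to stay within "a few in parallel" is a small concurrency limiter. The sketch below is an illustration only, not the dashboard's code: `runLimited` and `fakeFetch` are hypothetical names, and `fakeFetch` merely simulates work with a timer instead of fetching anything (a real task would call the fetching code, e.g. via NodeGit).

```js
// Run async tasks with at most `limit` of them in flight at once.
// Each task is a function returning a promise; results keep input order.
async function runLimited(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, worker);
  await Promise.all(workers);
  return results;
}

// Stand-in for a real fetch; it tracks how many tasks run concurrently.
let active = 0;
let maxActive = 0;
function fakeFetch(url) {
  return async () => {
    active++;
    maxActive = Math.max(maxActive, active);
    await new Promise((resolve) => setTimeout(resolve, 10));
    active--;
    return url;
  };
}

const urls = [
  'https://github.com/CatharsisFonts/Cormorant.git',
  'https://github.com/clauseggers/Inknut-Antiqua.git',
  'https://github.com/theleagueof/prociono.git',
  'https://github.com/Omnibus-Type/Sansita.git',
  'https://github.com/googlefonts/rubik.git',
];
const done = runLimited(urls.map(fakeFetch), 3).then((results) => {
  console.log(`fetched ${results.length} repos, max in flight: ${maxActive}`);
  return results;
});
```

The limit of 3 is picked from the "2-3-4" range in the quote; it is an assumption, not a documented threshold.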

davelab6 commented 6 years ago

> FYI: about `family name is confirmed as good?`: that row is going to be used as a replacement for:
>
> com.google.fonts/check/165 Familyname must be unique according to namecheck.fontdata.com

No; check 165 is meant to help inform that human decision, but some checks can only be executed by humans, like a final decision if the name is good. However, their results need to be stored somewhere. Currently that is this spreadsheet; but long term, we can probably make it a text file on GitHub (like how https://katydecorah.com/font-library works, like https://github.com/katydecorah/font-library/blob/gh-pages/CONTRIBUTING.md)

davelab6 commented 6 years ago

> Generally, for upstream, if it is a git repository I'd like to see the URL end with `.git`. For https://github.com/... I can infer this though.

If you visit https://github.com/googlefonts/literata.git in a browser then it will do the right thing, and likewise if you clone https://github.com/googlefonts/literata it will also work, but yes, formally https://github.com/googlefonts/literata.git is the correct URL. I don't mind having that formally correct URL in this sheet.

> Further, if the branch is not master we could write it like https://github.com/zeynepakay/Rakkas.git:gh-pages or alternatively add a further column git-branch which defaults to "master".

Another col seems better to me

> Concrete problems at the moment:
>
> Bad upstream url:
>
> * Antonio typo https://github.com/vernnobile/antonioFont/ should be https://github.com/vernnobile/antonioFont

Yes, killing trailing slashes seems like a good idea, although it seems like a WARN rather than a FAIL

> * Bangers unclear (branch?) https://github.com/googlefonts/bangers can't fetch the branch "master", maybe internal to my code.

Worked for me

```
$ git clone git@github.com:googlefonts/bangers.git
Cloning into 'bangers'...
remote: Counting objects: 297, done.
remote: Total 297 (delta 0), reused 0 (delta 0), pack-reused 297
Receiving objects: 100% (297/297), 1.68 MiB | 13.80 MiB/s, done.
Resolving deltas: 100% (225/225), done.
$
```

> * Elsie missing https://github.com/googlefonts/elsiefont

https://github.com/googlefonts/elsie exists but is empty

> * Esteban typo https://github.com/blipvert/googlefontdirectory-hg/tree/master/ofl/esteban should be ?

https://github.com/blipvert/googlefontdirectory-hg/tree/master/ofl/esteban loads for me.

> * Hip Hop Display missing https://github.com/googlefonts/HipHopFont
> * Hip Hop Text missing https://github.com/googlefonts/HipHopFont

Those families were finally published as https://fonts.google.com/specimen/Sedgwick+Ave and https://fonts.google.com/specimen/Sedgwick+Ave+Display, and it seems that their source repo was never actually published. I should make that happen!

I see a src repo in Marc's private Bitbucket account that was never published. It is now https://github.com/googlefonts/sedgwickave for both, and they are at https://fonts.google.com/?query=Sedgwick+Ave

The "Hip Hop" families should disappear from this Earth. :)

> * Literata missing https://github.com/googlefonts/LiterataFont

Yah, another repo that I didn't publish yet. However, this hasn't launched in the API, so it should be held back. I made the repo (actually renamed it to https://github.com/googlefonts/literata to be cleaner :) so at least the repo-exists check should pass.

> * Rakkas branch https://github.com/zeynepakay/Rakkas should be https://github.com/zeynepakay/Rakkas.git:gh-pages

Yes. In fact a lot of projects should probably have gh-pages as their default branch ;)

> * Shanti typo https://github.com/vernnobile/ShantiFont/tree/master/FINAL should be https://github.com/vernnobile/ShantiFont and fontfiles prefix should maybe be: "FINAL/shanti-"

Yes
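
As an illustration of the Antonio and Shanti cases (a trailing slash, and a GitHub "tree" page URL used as a repo URL), the sketch below shows how such entries could be detected and split. `normalizeGitHubUrl` is a hypothetical helper, and the returned `{ url, branch, path }` shape is an assumption of this example, not something the sheet or the dashboard defines.

```js
// Hypothetical normalizer for GitHub web URLs found in the sheet.
// Returns null for anything that is not an https://github.com/ URL.
// Branch names containing "/" cannot be told apart from paths here.
function normalizeGitHubUrl(value) {
  const match = /^https:\/\/github\.com\/([^/]+)\/([^/]+?)(?:\.git)?(?:\/(.*))?$/.exec(value.trim());
  if (!match) return null;
  const [, owner, repo, rest = ''] = match;
  const result = {
    url: `https://github.com/${owner}/${repo}.git`,
    branch: null, // unknown unless the URL was a "tree" page
    path: '',
  };
  // e.g. "tree/master/FINAL" -> branch "master", path "FINAL"
  const tree = /^tree\/([^/]+)(?:\/(.*))?$/.exec(rest.replace(/\/+$/, ''));
  if (tree) {
    result.branch = tree[1];
    result.path = tree[2] || '';
  }
  return result;
}

console.log(normalizeGitHubUrl('https://github.com/vernnobile/antonioFont/'));
console.log(normalizeGitHubUrl('https://github.com/vernnobile/ShantiFont/tree/master/FINAL'));
```

Whether the result is used to repair the value silently or only to report a warning about the row is a separate decision.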

davelab6 commented 6 years ago

> Should we add some mild self-checks when reading the data?

Yes! Although since it is a spreadsheet, the sheet can itself have "input validation" (https://www.youtube.com/watch?v=8YTuhT3rbPI)

graphicore commented 6 years ago

> No; check 165 is meant to help inform that human decision, but some checks can only be executed by humans, like a final decision if the name is good.

That check is a pain though, it's really really broken. Since we aren't getting rid of it, we'll have to repair it.

graphicore commented 6 years ago

> If you visit https://github.com/googlefonts/literata.git in a browser then it will do the right thing, and likewise if you clone https://github.com/googlefonts/literata it will also work,

That's beside the point. The point is that we have no indicator of the version control system used in the upstream, but we have mixed upstream VCS in that spreadsheet (hg and git). Adding `.git` makes it explicit and easy to determine how to fetch the files. https://github.com also works as an indicator, but it's possible to host git repositories somewhere other than GitHub.

graphicore commented 6 years ago

> Another col seems better to me

"git-branch" ?

davelab6 commented 6 years ago

A col for "branch" and a col for "vcs" seems fine to me, and if they are blank then we can assume git and master
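
A minimal sketch of that defaulting rule, assuming each sheet row has already been turned into an object: the `upstream`, `vcs` and `branch` property names and the `resolveUpstream` helper are hypothetical, since the columns are only being proposed here.

```js
// Hypothetical row shape; blank "vcs" and "branch" cells fall back to
// "git" and "master" as suggested above.
function resolveUpstream({ upstream, vcs = '', branch = '' }) {
  return {
    upstream,
    vcs: vcs.trim() || 'git',
    branch: branch.trim() || 'master',
  };
}

console.log(resolveUpstream({
  upstream: 'https://github.com/zeynepakay/Rakkas.git',
  branch: 'gh-pages',
}));
console.log(resolveUpstream({
  upstream: 'https://github.com/googlefonts/bangers.git',
  vcs: ' ',
  branch: '',
}));
```

What a blank branch should mean for a non-git upstream is not settled in this thread, so the sketch does not special-case it.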

graphicore commented 6 years ago

> Bad upstream url:
>
> Antonio typo https://github.com/vernnobile/antonioFont/ should be https://github.com/vernnobile/antonioFont
>
> although it seems like a WARN rather than a FAIL

This is before fontbakery is invoked. We just fail fetching the files if the upstream info is bad. We could somehow make fontbakery report stuff like this, though. You'd have to task me explicitly with this; I wouldn't just build it in, as it seems like a bigger detour.

The question for me is rather how many input faux pas should be fixed in the code and how many should be fixed in the input data. Fixing a row in the spreadsheet is much faster than making the consuming code forgive each possible or observed input data failure (although it's possible).

> Bangers unclear (branch?) https://github.com/googlefonts/bangers can't fetch the branch "master", maybe internal to my code.
>
> Worked for me

`git clone` works for me too. However, the code of the source implementation (using the NodeGit library) has problems. I'll postpone this until later.

> Elsie missing https://github.com/googlefonts/elsiefont
>
> https://github.com/googlefonts/elsie exists but is empty

Doesn't exist for me/unauthorized:

[image: selection_206] https://user-images.githubusercontent.com/393132/40008747-5e427396-57a0-11e8-8406-eed2c17483eb.png

> Should we add some mild self-checks when reading the data?
>
> Yes! Although since it is a spreadsheet, the sheet can itself have "input validation"

Thus, we should just fail hard and early and have the spreadsheet input validation in place to prevent it?

davelab6 commented 6 years ago

Yes, I don't think time spent making the sheet input code more robust is time well spent. Let's just fix the sheet


graphicore commented 6 years ago

About the SSL errors, I said:

> SSL error
>
> I'm not sure about these. Maybe it's GitHub punishing me for doing many unauthenticated requests, investigating:

Good news is, we are not alone on these:

https://github.com/libgit2/libgit2/issues/4644 https://github.com/nodegit/nodegit/issues/1495

felipesanches commented 5 years ago

Can we close this? Or maybe move it to the dashboard issue tracker if appropriate?

davelab6 commented 5 years ago

Transferred