sdruskat / cff-in-the-wild

Analysis of Citation File Format files on GitHub
Apache License 2.0

How many CFF files are valid? #7

Open sdruskat opened 2 years ago

sdruskat commented 2 years ago

This could be done by reusing cffconvert to validate the files (and record error messages).
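A minimal sketch of the "validate and record error messages" idea, in Python. The `validate` argument is a hypothetical hook, not part of cffconvert's API: it stands for any callable that raises an exception on an invalid file, such as a thin wrapper around whichever validator ends up being used.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Optional


def collect_validation_errors(
    paths: Iterable[Path],
    validate: Callable[[str], None],
) -> Dict[str, Optional[str]]:
    """Map each file path to None (valid) or to the recorded error message."""
    results: Dict[str, Optional[str]] = {}
    for path in paths:
        try:
            text = Path(path).read_text(encoding="utf-8")
            # Hypothetical hook: expected to raise if the content is invalid.
            validate(text)
            results[str(path)] = None
        except Exception as err:
            # Broad on purpose: read, parse, and schema failures are all recorded.
            results[str(path)] = f"{type(err).__name__}: {err}"
    return results
```

Keeping the exception type in the recorded message makes it possible to separate files that fail to parse from files that parse but violate the schema, provided the chosen validator raises different exception types for the two cases.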

Bisaloo commented 2 years ago

I'm going to use https://docs.ropensci.org/jsonvalidate/, which gives clear error messages.

It could still be useful to compare with the output of another validator, though. Hopefully, the results will be the same.
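One possible way to run that comparison, sketched in Python. It assumes each validator's results have already been exported as a mapping from file path to a valid/invalid flag; the function name and the input shape are illustrative, not something either tool produces directly.

```python
from typing import Dict, List


def disagreements(a: Dict[str, bool], b: Dict[str, bool]) -> List[str]:
    """Return the files that both validators checked but judged differently."""
    shared = a.keys() & b.keys()  # only compare files seen by both validators
    return sorted(path for path in shared if a[path] != b[path])
```

An empty list would mean the two validators agree on every file they both processed.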

amal-ghamdi commented 2 years ago

I can help with the visualization.

[Image: pie_1]

amal-ghamdi commented 2 years ago

Some updates:

[Image: validity]

Bisaloo commented 2 years ago

I think we have two levels of 'invalidity':

@amal-ghamdi, what do you think of having two hues of 'invalid' to distinguish between these two types?

amal-ghamdi commented 2 years ago

Yes, that makes sense. This should also be reflected in the data @samharrison7 provides.

Bisaloo commented 2 years ago

On my end, out of 3307 files, I can parse 3146, and 1804 of those are valid CFF files. I can provide a script later if these results differ from what @samharrison7 finds :slightly_smiling_face:
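For the chart, these counts can be turned into the three slices with a few lines of Python. This is only a sketch: it assumes the 1804 valid files are counted among the 3146 parseable ones, and the function name and category labels are illustrative.

```python
from typing import Dict


def validity_breakdown(total: int, parseable: int, valid: int) -> Dict[str, int]:
    """Split the total into valid, parseable-but-invalid, and unparseable files."""
    return {
        "valid": valid,
        "parseable but invalid": parseable - valid,
        "not parseable": total - parseable,
    }


counts = validity_breakdown(total=3307, parseable=3146, valid=1804)
for label, count in counts.items():
    # Report each category with its share of all files.
    print(f"{label}: {count} ({count / 3307:.1%})")
```

With the numbers above, this gives 1342 files that parse but fail validation and 161 files that cannot be parsed at all.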

amal-ghamdi commented 2 years ago

Got it!