Closed samreid closed 9 years ago
If we choose a style of JSON that matches the 3rd party code contributions, it will be a good step toward #180 unifying code/art 3rd party support.
I'll have to switch to another task for a bit, but thought I'd leave some notes here. For converting txt=>json, I was writing Java code in AnnotationParser.java (from our svn) like so:
// Needs java.io.File, java.io.IOException, java.util.StringTokenizer and PhET's FileUtils.

// Recursively visits a file tree and reports each license.txt that is found.
public static void visit( File file ) {
  if ( file.isDirectory() ) {
    File[] fileList = file.listFiles();

    // listFiles() returns null if the directory cannot be read
    if ( fileList != null ) {
      for ( File child : fileList ) {
        visit( child );
      }
    }
  }
  else if ( file.getName().equals( "license.txt" ) ) {
    System.out.println( "Found license.txt file at: " + file.getAbsolutePath() );
    try {
      String s = FileUtils.loadFileAsString( file );

      // Split the file into lines; converting each line to JSON is not written yet
      StringTokenizer st = new StringTokenizer( s, "\n" );
    }
    catch( IOException e ) {
      e.printStackTrace();
    }
  }
}

public static void main( String[] args ) {
  Annotation a = AnnotationParser.parse( "test-id name=my name age=3 timestamp=dec 13, 2008" );
  System.out.println( "a = " + a );
  File root = new File( "/Users/samreid/github" );
  visit( root );
}
Obviously we'll need to add a bit more code here :smiley:
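For anyone picking this up, here is a rough sketch of the missing per-line step, written in JavaScript rather than Java. It assumes that each license.txt line has the same shape as the test string passed to AnnotationParser.parse above (an id followed by key=value pairs whose values may contain spaces). The function names parseAnnotation and licenseTextToObject are hypothetical, and this is not the AnnotationParser implementation.

```javascript
// Hypothetical helper: splits one "id key=value key=value" line into its parts.
// A new key starts at whitespace (or the start of the text) followed by word characters and '='.
function parseAnnotation( line ) {
  const trimmed = line.trim();
  const firstSpace = trimmed.indexOf( ' ' );
  const id = firstSpace === -1 ? trimmed : trimmed.slice( 0, firstSpace );
  const rest = firstSpace === -1 ? '' : trimmed.slice( firstSpace + 1 );

  const fields = {};
  const matches = [ ...rest.matchAll( /(^|\s)(\w+)=/g ) ];
  matches.forEach( ( match, i ) => {

    // The value runs from the end of this key to the start of the next key (or the end of the line)
    const valueStart = match.index + match[ 0 ].length;
    const valueEnd = i + 1 < matches.length ? matches[ i + 1 ].index : rest.length;
    fields[ match[ 2 ] ] = rest.slice( valueStart, valueEnd ).trim();
  } );
  return { id: id, fields: fields };
}

// Hypothetical helper: converts the text of one license.txt into an object keyed by id.
function licenseTextToObject( text ) {
  const result = {};
  text.split( /\r?\n/ ).forEach( line => {
    if ( line.trim().length > 0 ) {
      const annotation = parseAnnotation( line );
      result[ annotation.id ] = annotation.fields;
    }
  } );
  return result;
}

console.log( JSON.stringify( licenseTextToObject( 'test-id name=my name age=3 timestamp=dec 13, 2008' ), null, 2 ) );
```

The keys in the output are whatever keys appear in the text file, so mapping them onto the final JSON field names would still be a separate step.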
Incremental progress in https://phet.unfuddle.com/a#/projects/9404/repositories/23262/commit?commit=74402
I'm going to temporarily disable the plugin license.txt requirements while I am incrementally porting TXT => JSON
To maintain history, it looks like I will need to do this as a 2-step process, converting the contents of the file before changing the filename.
I'll also want to make sure the process for handling the JSON works well before getting too far so I can make sure we won't require systemic changes after this batch conversion.
After I'm done here, I should double check the projectURL and text fields for 3rd party images & audio.
EDIT: done
I've finished converting license.txt to license.json. A summary of what was done:
Reused JSON schema from the 3rd party code contributions, so they will match. For example, here is an annotated public domain image:
"cement-texture-dark.jpg": {
  "text": [
    "Public Domain"
  ],
  "projectURL": "http://www.public-domain-image.com/full-image/textures-and-patterns-public-domain-images-pictures/concrete-texture-public-domain-images-pictures/cement-texture.jpg-royalty-free-stock-image.html",
  "license": "Public Domain",
  "notes": ""
}
and here is an annotated PhET image:
"explore-icon.png": {
  "text": [
    "Copyright 2002-2015 University of Colorado Boulder"
  ],
  "projectURL": "http://phet.colorado.edu",
  "license": "contact phethelp@colorado.edu",
  "notes": "created by John Blanco"
}
The text field states the copyright, if any. The projectURL field is where the image/audio was obtained from; for a PhET image/audio, it must read http://phet.colorado.edu. The license field specifies the license. The notes field gives any additional or helpful information, such as whether the image was modified from another source, who created it, etc.
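To make the field descriptions concrete, here is a minimal sketch of a check for one entry. It only assumes what is described above (text is an array of strings; projectURL, license and notes are strings, and notes may be empty). The function name validateEntry is hypothetical and is not part of chipper.

```javascript
// Hypothetical validator: returns a list of problems found in one license.json entry.
function validateEntry( filename, entry ) {
  const problems = [];

  // "text" holds the copyright statement(s), one string per line
  if ( !Array.isArray( entry.text ) || !entry.text.every( line => typeof line === 'string' ) ) {
    problems.push( filename + ': "text" must be an array of strings' );
  }

  // The remaining fields are plain strings; "notes" is allowed to be the empty string
  [ 'projectURL', 'license', 'notes' ].forEach( key => {
    if ( typeof entry[ key ] !== 'string' ) {
      problems.push( filename + ': "' + key + '" must be a string' );
    }
  } );
  return problems;
}

console.log( validateEntry( 'explore-icon.png', {
  text: [ 'Copyright 2002-2015 University of Colorado Boulder' ],
  projectURL: 'http://phet.colorado.edu',
  license: 'contact phethelp@colorado.edu',
  notes: 'created by John Blanco'
} ) );
```

Running the check over each entry of a license.json file would flag missing or mistyped fields before the report tasks try to read them.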
I think the only thing left to do before closing this issue is to run grunt-all.sh and see what problems come up.
I ran grunt-all.sh build-no-lint and saw related errors in:
john-travoltage, making-tens, build-a-molecule
I resolved the John Travoltage issue and the Snow Day Math issue (see https://github.com/phetsims/making-tens/issues/25). @jonathanolson still needs to comment on the Build a Molecule issue but this is tracked in https://github.com/phetsims/build-a-molecule/issues/75 so I'm ready to close this issue.
Reopening.
The description of fields in the first comment of this issue says:
notes: Optional for all entries, misc notes about the resource.
If notes is optional, then why do we have 111 occurrences of "notes": "" in license.json files?
If notes is optional, then the implementation doesn't reflect that. Specifically, line 149 of createThirdPartyReport.js:
var lines = [
  '**' + library + '**',
  json[ library ].text.join( '<br>' ),
  json[ library ].projectURL,
  'License: [' + json[ library ].license + '](licenses/' + library + '.txt)',
  'Notes: ' + json[ library ].notes
];
Where are the fields in license.json documented (other than in the first comment of this issue)? Should they be documented in the header comment of createThirdPartyReport.js? Or some other grunt task?
I decided to make "notes" required to simplify the json schema and to encourage people to add notes for new images & audio.
The fields are documented in https://github.com/phetsims/simula-rasa/blob/master/images/README.txt
I chose that location since it is the template that ends up "creating" most of the images directories. Where do you recommend to put this information?
Image files belong in this directory. Each image must have an entry in license.json which indicates the origin of the image as well as its licensing. If this directory has subdirectories, each subdirectory must have its own license.json file.
The license.json file should contain one entry per file, and each should be annotated with the following:
For an example, please see any of the license.json files in a PhET simulation's image directory.
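The rule that every file needs an entry in the license.json of its own directory can be checked mechanically. Below is a minimal Node.js sketch of such a check. The function name findUnannotatedFiles is hypothetical, and the sketch assumes that everything in the directory other than license.json itself needs an entry.

```javascript
const fs = require( 'fs' );
const path = require( 'path' );

// Hypothetical helper: recursively lists files that have no entry in their directory's license.json.
function findUnannotatedFiles( directory ) {
  const licensePath = path.join( directory, 'license.json' );

  // A directory without license.json is treated as having no entries at all
  const entries = fs.existsSync( licensePath ) ? JSON.parse( fs.readFileSync( licensePath, 'utf8' ) ) : {};

  let unannotated = [];
  fs.readdirSync( directory, { withFileTypes: true } ).forEach( dirent => {
    const fullPath = path.join( directory, dirent.name );
    if ( dirent.isDirectory() ) {

      // Each subdirectory is checked against its own license.json
      unannotated = unannotated.concat( findUnannotatedFiles( fullPath ) );
    }
    else if ( dirent.name !== 'license.json' && !Object.prototype.hasOwnProperty.call( entries, dirent.name ) ) {
      unannotated.push( fullPath );
    }
  } );
  return unannotated;
}
```

Calling findUnannotatedFiles on a simulation's images directory would return the paths that still need an entry. Non-media files such as a README would also be reported unless they are excluded.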
Where do you recommend to put this information?
Recommended to put it in the grunt task that reads it. See for example setThirdPartyLicenses.js, which describes the format of sherpa/lib/license.json.
The primary file that uses it in chipper is getLicenseInfo.js, and there is already a reference to https://github.com/phetsims/simula-rasa/blob/master/images/README.txt in there:
/*
 * The classification is one of: missing-license.json, not-annotated, phet or third-party
 * isProblematic indicates whether the particular license is compatible with PhET's licensing
 * entry: the object that appears in the license.json file, see
 * https://github.com/phetsims/simula-rasa/blob/master/images/README.txt
 */
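For readers unfamiliar with the four classifications named in that comment, here is a rough sketch of how they could be assigned. It is not the logic in getLicenseInfo.js. It assumes that a PhET resource is recognized by its projectURL being http://phet.colorado.edu, as described earlier in this issue, and the function name classify is hypothetical.

```javascript
// Hypothetical sketch; the real rules live in getLicenseInfo.js and may differ.
// licenseJSON is the parsed license.json for a directory, or null if the file does not exist.
function classify( licenseJSON, filename ) {
  if ( !licenseJSON ) {
    return 'missing-license.json';
  }
  const entry = licenseJSON[ filename ];
  if ( !entry ) {
    return 'not-annotated';
  }

  // Assumption: PhET-owned resources are identified by their projectURL
  return entry.projectURL === 'http://phet.colorado.edu' ? 'phet' : 'third-party';
}

console.log( classify( { 'explore-icon.png': { projectURL: 'http://phet.colorado.edu' } }, 'explore-icon.png' ) );
```

Deciding isProblematic would additionally need a list of licenses that PhET considers compatible, which is not covered here.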
getLicenseInfo is not a grunt task, but a utility called by createImageAndAudioLicenseReport, and also used by createSimSpecificThirdPartyReport. I'm not a fan of duplicating this documentation in >1 place, can you help me determine where it should live specifically?
Problems with relying on documentation in simula-rasa/images/README.md:
(1) The description of license.json pertains to all media types, not just images.
(2) Suppressing the propagation of README.md (https://github.com/phetsims/simula-rasa/issues/5) is additional work.
(3) Until propagation of the README.md is suppressed, copies of this file will proliferate, effectively duplicating the documentation.
(4) I would be unlikely to go looking for this info in simula-rasa/images/README.md. If that file needs to be referenced in the getLicenseInfo.js documentation, why not just put the documentation in getLicenseInfo.js? That would eliminate problems (1), (2) and (3) above.
I deleted simula-rasa/images/README.md in https://github.com/phetsims/simula-rasa/issues/5 and moved the documentation to getLicenseInfo.js in e4e9f2eee42ed0d17ef0fd7e88bf7f500a2d11f5
@pixelzoom can you take a look at your convenience?
:+1: Closing.
I tweaked a few things in the license.json doc, @samreid review if you'd like.
The tweaks look good to me.
From https://github.com/phetsims/tasks/issues/274, @pixelzoom suggested converting license.txt to JSON:
Design (15+ hours)
File conversion (10-20 hours)
Proposed format for license.json, example:
Description of fields:
Required for all entries:
source: Tells us whether the resource is owned by "PhET" or a 3rd-party. For 3rd-parties, this field identifies the organization or individual(s) who own the resource.
Required for all entries where source === "PhET":
author: Identifies the individual(s) that created the resource.
Required for all entries where source !== "PhET":
url: The URL of the organization or individual(s) that own the resource. If there is no URL, put "none".
license: Identifies the official name of the license under which PhET is using the resource, e.g., "The MIT License".
licenseURL: URL to the specific license. If there is no URL, put "none".
Optional for all entries:
notes: Optional for all entries, misc notes about the resource.
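The conditional requirements in this proposed format can be expressed as a small check. The sketch below follows only the field rules listed above. The function name missingFields and the sample entries are hypothetical, and this proposal differs from the schema that was eventually adopted later in the thread.

```javascript
// Hypothetical check for the proposed format: returns the names of required fields that are absent.
function missingFields( entry ) {
  const required = [ 'source' ];
  if ( entry.source === 'PhET' ) {

    // PhET-owned resources must name their author
    required.push( 'author' );
  }
  else {

    // Third-party resources must identify the owner's URL and the license
    required.push( 'url', 'license', 'licenseURL' );
  }

  // "notes" is optional, so it is never reported
  return required.filter( key => typeof entry[ key ] !== 'string' || entry[ key ].length === 0 );
}

// Made-up entries, for illustration only
console.log( missingFields( { source: 'PhET', author: 'Jane Doe' } ) );
console.log( missingFields( { source: 'Example Org', url: 'none' } ) );
```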