fatty- / daisy-pipeline

Automatically exported from code.google.com/p/daisy-pipeline

file set type detection #41

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
From Jostein: 
How about a step to detect the type of a _fileset_ as well (DAISY 3, DAISY 2.02, 
etc.)? What would the "type" of these be?

Original issue reported on code.google.com by rdeltour@gmail.com on 16 May 2011 at 2:49

GoogleCodeExporter commented 9 years ago
To be able to filter the list of scripts in the Web UI, we need a way to 
declare what type of filesets a script accepts.

In the "v1.5" branch of the Web UI, JavaScript is used client-side to determine 
the type of fileset that is uploaded. There, a fileset type is defined as 
follows (in JSON):
"SCRIPTID": { type: "TYPE", name: "NAME", requirements: REQUIREMENT* }

where:
SCRIPTID: string
TYPE: string
NAME: string
REQUIREMENT: { fileName: string|RegExp, contentType: string|RegExp, both: 
REQUIREMENT*, either: REQUIREMENT* }

For instance, the DAISY 2.02 fileset type is defined as follows:
"daisy202": {
  type: "multipart/x-daisy202",
  name: "DAISY 2.02",
  requirements: [
    { fileName: new RegExp("(^|/)ncc\\.html$","i") }
  ]
}

...which simply means that if the fileset contains a file named "ncc.html" 
(case insensitive), then it is a DAISY 2.02 fileset.
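
A minimal sketch of how such declarations could be evaluated against an uploaded fileset. This is not the Web UI's actual code: the function names (matchesPattern, requirementHolds, detectFilesetType) and the file model ({ fileName, contentType }) are hypothetical, and the semantics of "both" (all nested requirements must hold) and "either" (at least one must hold) are assumptions inferred from the field names.

```javascript
// Hypothetical sketch; names and semantics are assumptions, not the Web UI's code.
// A file is modelled as { fileName: string, contentType: string }.

// Test a value against a string (exact match) or a RegExp pattern.
function matchesPattern(pattern, value) {
  if (pattern === undefined) return true; // no constraint given
  if (pattern instanceof RegExp) return pattern.test(value);
  return pattern === value;
}

// Assumed semantics: a requirement holds if a single file satisfies its
// fileName/contentType constraints, every entry in `both` holds, and at
// least one entry in `either` holds.
function requirementHolds(req, files) {
  const constrainsFile =
    req.fileName !== undefined || req.contentType !== undefined;
  if (constrainsFile) {
    const found = files.some(
      (f) =>
        matchesPattern(req.fileName, f.fileName) &&
        matchesPattern(req.contentType, f.contentType)
    );
    if (!found) return false;
  }
  if (req.both && !req.both.every((r) => requirementHolds(r, files))) {
    return false;
  }
  if (req.either && !req.either.some((r) => requirementHolds(r, files))) {
    return false;
  }
  return true;
}

// Return the type of the first declaration whose requirements all hold,
// or null if none match.
function detectFilesetType(filesetTypes, files) {
  for (const id of Object.keys(filesetTypes)) {
    const def = filesetTypes[id];
    if (def.requirements.every((r) => requirementHolds(r, files))) {
      return def.type;
    }
  }
  return null;
}

// The DAISY 2.02 declaration from the comment above.
const filesetTypes = {
  daisy202: {
    type: "multipart/x-daisy202",
    name: "DAISY 2.02",
    requirements: [{ fileName: new RegExp("(^|/)ncc\\.html$", "i") }],
  },
};

const detected = detectFilesetType(filesetTypes, [
  { fileName: "book/NCC.HTML", contentType: "text/html" },
]);
console.log(detected); // "multipart/x-daisy202"
```

Returning the first match keeps the sketch short; if several fileset types could match the same upload, the order of the declarations would decide the result.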

Current fileset types used in the Web UI:
DAISY 2.02: multipart/x-daisy202
DAISY 3: multipart/x-daisy3
DTBook: application/x-dtbook+xml
ZedAI: application/z3998-auth+xml
EPUB: application/epub+zip (both zipped and unzipped; version 2 vs. 3 cannot 
be determined without inspecting the OPF)
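
To illustrate the EPUB remark: the version is declared in the "version" attribute of the OPF's root package element ("2.0" for EPUB 2, "3.0" for EPUB 3). The sketch below reads that attribute with a regular expression. The function name epubVersionFromOpf is hypothetical, and a real implementation would more likely use an XML parser than a regex.

```javascript
// Hypothetical sketch: extract the `version` attribute of the root
// <package> element from the text of an OPF file. Returns the attribute
// value as a string, or null if no such attribute is found.
function epubVersionFromOpf(opfXml) {
  // Allows an optional namespace prefix (e.g. <opf:package ...>) and
  // either quote style around the attribute value.
  const re = /<(?:[\w.-]+:)?package\b[^>]*?\bversion\s*=\s*(["'])(.*?)\1/;
  const match = re.exec(opfXml);
  return match ? match[2] : null;
}

const opf =
  '<?xml version="1.0" encoding="UTF-8"?>\n' +
  '<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="uid">\n' +
  "</package>";
console.log(epubVersionFromOpf(opf)); // "3.0"
```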

Links to discussion:
https://groups.google.com/d/msg/daisy-pipeline-dev/cBP8CcSyBE8/ZqPBna2vJOIJ
https://groups.google.com/d/msg/daisy-pipeline-dev/cBP8CcSyBE8/LfHtu1PxfyQJ

Client-side fileset type declarations:
https://github.com/daisy-consortium/pipeline-webui/commit/2be59842dc3c066d2316553ba97dece938103615#L7R143

Original comment by josteinaj@gmail.com on 4 Mar 2013 at 9:19

GoogleCodeExporter commented 9 years ago
Typo: "SCRIPTID" should be "FILESET_TYPE_ID". It has nothing to do with the 
script id.

Original comment by josteinaj@gmail.com on 4 Mar 2013 at 9:33

GoogleCodeExporter commented 9 years ago
An XML representation of the JSON file used by the Web UI can be included in 
mediatype-utils. A new step, px:mediatype-detect-fileset, would accept a 
d:fileset and return a <c:result result="{MIME type for fileset}"/>.

(Low priority, though; we don't really need it at the moment.)

Original comment by josteinaj@gmail.com on 20 Jun 2013 at 3:12