GALAglobal / TAPICC-API-implementation

TAPICC API implementation using node.js framework sails.js

downloading assets of a downloadable deliverable #26

Closed assembledStarDust closed 6 years ago

assembledStarDust commented 6 years ago

How would I gain access to the assets of a deliverable when downloading it with the following URL?

/job/{parentid}/task/{id}/downloaddeliverable

For example, while the deliverable might be an XLIFF file, what about the TM file, terminology, reference files, etc.? Are there going to be any reference IDs I can use in the downloaddeliverable response?

Alino commented 6 years ago

I am not sure I understood correctly.

To download an asset: GET /job/{parentid}/asset/{id}/downloadfile

To download a task deliverable: GET /job/{parentid}/task/{id}/downloaddeliverable

Does that answer your question?
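As a small illustration of the two routes above, the paths can be built from a job id plus an asset or task id. This is only a sketch: the helper names `assetFilePath` and `taskDeliverablePath` are hypothetical and not part of the TAPICC implementation; only the URL patterns come from this thread.

```javascript
// Hypothetical helpers; only the two URL patterns are taken from the thread.
function assetFilePath(jobId, assetId) {
  // GET /job/{parentid}/asset/{id}/downloadfile
  return `/job/${encodeURIComponent(jobId)}/asset/${encodeURIComponent(assetId)}/downloadfile`;
}

function taskDeliverablePath(jobId, taskId) {
  // GET /job/{parentid}/task/{id}/downloaddeliverable
  return `/job/${encodeURIComponent(jobId)}/task/${encodeURIComponent(taskId)}/downloaddeliverable`;
}

console.log(assetFilePath(1, 5));
console.log(taskDeliverablePath(1, 14));
```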

assembledStarDust commented 6 years ago

A typical translation kit is made up of several parts: reference material, termbase, TM. The TM for the kit might be a subset of the TMS TM, holding only relevant candidate matches. While I get that this might become an asset, what do I have to link the subset TM to the download kit?

I'd expect the response of downloaddeliverable to include a list of asset IDs that relate to the translation kit.

Hope that's clearer.

Alino commented 6 years ago

Sorry, I don't know what TM and TMS are.

But can we classify termbase, TM and TMS as references that should be associated with an Asset?

If the answer is yes, then I propose to get rid of the Asset.isReference field and create a new data model called 'Reference'. It would be a child of an Asset. So whenever you retrieve an Asset, you would also retrieve all its References, giving you a list of ids that you can download too.

example:

GET /job/1/asset/5

response:

{
  id: 5,
  sourceLanguage: 'en',
  jobId: 1,
  tasks: [{ id: 14, targetLanguage: 'sk', type: 'translation' }],
  references: [
    { id: 23, assetId: 5 },
    { id: 24, assetId: 5 }
  ]
}

Now you would know there are 2 references, with ids 23 and 24, so next you could do:

GET /job/1/asset/5/reference/23/downloadfile
GET /job/1/asset/5/reference/24/downloadfile

EDIT: you can also see Task ids here; there is one with id 14. So you could also download a Task's deliverable with

GET /asset/5/task/14/downloaddeliverable
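To make the proposal concrete, here is a rough sketch of how a client might turn such an Asset response into a list of download paths. The response shape is the one proposed above and does not exist in the API yet. The function name `downloadPathsFor` is hypothetical, and the deliverable path uses the existing /job/{parentid}/task/{id}/downloaddeliverable route from the start of this thread.

```javascript
// Assumed response shape, copied from the proposed GET /job/1/asset/5 example.
const asset = {
  id: 5,
  sourceLanguage: 'en',
  jobId: 1,
  tasks: [{ id: 14, targetLanguage: 'sk', type: 'translation' }],
  references: [
    { id: 23, assetId: 5 },
    { id: 24, assetId: 5 },
  ],
};

// Hypothetical helper: derive every download path related to one Asset.
function downloadPathsFor(asset) {
  const base = `/job/${asset.jobId}`;
  const referencePaths = (asset.references || []).map(
    (ref) => `${base}/asset/${asset.id}/reference/${ref.id}/downloadfile`
  );
  const deliverablePaths = (asset.tasks || []).map(
    (task) => `${base}/task/${task.id}/downloaddeliverable`
  );
  return { referencePaths, deliverablePaths };
}

console.log(downloadPathsFor(asset));
```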

I have created a separate issue for this proposal -> #32

assembledStarDust commented 6 years ago

Agree with getting rid of Asset.isReference. Suggest adding Asset.assetType; values might be uploadedDeliverable, downloadableDeliverable, content, reference, TM, termbase.

Also suggest adding an optional Asset.path as a string.

Add assets associated with a task.

Drop these calls:

/job/{parentid}/task/{id}/uploaddeliverable (Upload deliverable file)
/job/{parentid}/task/{id}/downloaddeliverable (Download deliverable file)

Enhance the call /job/{parentid}/task/{id} (Get a Task) to add assets:

{
  "id": 0,
  "type": "translation",
  "targetLanguage": "string",
  "assetId": 0,
  "progress": "pending",
  "assignedTo": 0,
  "file": "string",
  "jobId": {
    "id": 0,
    "name": "string",
    "description": "string",
    "submitDate": "2018-07-30T14:09:44.744Z",
    "dueDate": "2018-07-30T14:09:44.744Z",
    "closedDate": "2018-07-30T14:09:44.744Z",
    "submitter": 0,
    "externalId": "string",
    "createdAt": "2018-07-30T14:09:44.744Z",
    "updatedAt": "2018-07-30T14:09:44.744Z"
  },
  "createdAt": "2018-07-30T14:09:44.744Z",
  "updatedAt": "2018-07-30T14:09:44.744Z",
  "assets": [{ "assetId": 0 }, { "assetId": 0 }]
}

Enhance the call /job/{parentid}/asset/uploadfile:

Add taskId and remove the reference flag. taskId can be 0 to denote no specific task. Add the field assetType; values might be uploadedDeliverable, downloadableDeliverable, content, reference, TM, termbase, clientDeliverable.

{
  "id": 0,
  "taskId": 0,
  "assetType": "string",
  "file": "string",
  "sourceLanguage": "string",
  "encoding": "string",
  "jobId": {
    "id": 0,
    "name": "string",
    "description": "string",
    "submitDate": "2018-07-30T14:17:17.932Z",
    "dueDate": "2018-07-30T14:17:17.932Z",
    "closedDate": "2018-07-30T14:17:17.932Z",
    "submitter": 0,
    "externalId": "string",
    "createdAt": "2018-07-30T14:17:17.932Z",
    "updatedAt": "2018-07-30T14:17:17.932Z"
  },
  "createdAt": "2018-07-30T14:17:17.932Z",
  "updatedAt": "2018-07-30T14:17:17.932Z"
}
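A minimal sketch of how the suggested fields could be checked before an upload, assuming the proposal is adopted as written. `ASSET_TYPES` and `validateAssetUpload` are hypothetical names; the list of values and the "taskId 0 means no specific task" convention are taken from the suggestion above, not from the current API.

```javascript
// Candidate assetType values as listed in the suggestion (not an agreed spec).
const ASSET_TYPES = [
  'uploadedDeliverable',
  'downloadableDeliverable',
  'content',
  'reference',
  'TM',
  'termbase',
  'clientDeliverable',
];

// Hypothetical check: returns a list of problems, empty when the metadata looks fine.
function validateAssetUpload(meta) {
  const errors = [];
  if (!ASSET_TYPES.includes(meta.assetType)) {
    errors.push(`unknown assetType: ${meta.assetType}`);
  }
  // taskId 0 denotes "no specific task" in the suggestion, so 0 is allowed.
  if (!Number.isInteger(meta.taskId) || meta.taskId < 0) {
    errors.push('taskId must be a non-negative integer (0 = no specific task)');
  }
  return errors;
}

console.log(validateAssetUpload({ assetType: 'TM', taskId: 0 }));
console.log(validateAssetUpload({ assetType: 'glossary', taskId: -1 }));
```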

assembledStarDust commented 6 years ago

I see that this is a similar question to:

https://github.com/GALAglobal/TAPICC-API-implementation/issues/18

Alino commented 6 years ago

I understand you would like Asset.path added for the location of the file on the server. That's one way this could be done. Currently we are storing the files as binaries in the database. What are the benefits of changing this?

Can you please describe Asset.assetType? What does each of the values represent and how should they work? (uploadedDeliverable, downloadableDeliverable, content, reference, TM, termbase.)

I am especially interested in whether we can classify TM and termbase as references, or whether these are separate concepts.

Why should we remove these calls? How else would you want to upload deliverables into Tasks?

/job/{parentid}/task/{id}/uploaddeliverable (there would be no way to upload a file into a Task)
/job/{parentid}/task/{id}/downloaddeliverable (this would make sense if we adopt Asset.path)

UPDATE: I just realised you suggest adding an array of Assets to a Task. That seems confusing to me. It makes more sense to me to make Tasks children of an Asset, as in issue #18.

Wouldn't the proposal in my previous post solve the problem in your original question?

assembledStarDust commented 6 years ago

Asset.path would be a reference for the sender of the file. A particular file might be referenced by a path on the client server. Uploading as an asset loses this info. When pulling back a final translated file, there could be benefit in knowing the original path it came from.

Asset.assetType:

"uploadedDeliverable": XLIFF file or other translation transport file that has been completed by the vendor.
"downloadableDeliverable": XLIFF or other translation transport file that is ready for translation by the vendor.
"content": source content file.
"reference": reference material for the vendor to assist with translation context.
"termbase": subset of the system termbase that would hold only terms relating to the particular content.
"translation memory": subset of the system translation memory that would hold candidate matches for the vendor.

Removing uploadDeliverable etc. because they are superfluous. A deliverable is an asset, right? Upload it as one. Tag it as what it is. Tie it to the task. Minimizes the number of calls.

Yes, the solution in your previous post would solve the problem. You would need to download the task, find the single asset id, then download the asset, find the asset's reference ids and download those. However, I wonder if that's limiting, as there isn't always a one-to-one relationship between task and asset. There could be many assets associated with a task.
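The lookup chain described here (task, then its single asset, then that asset's references) might look roughly like the sketch below. It assumes the 'Reference' proposal from earlier in the thread is adopted. `collectKitPaths` and `getJson` are hypothetical names, and `getJson` is stubbed with in-memory data so the sketch runs without a server.

```javascript
// Hypothetical walk: task -> asset -> references, returning every path to download.
async function collectKitPaths(getJson, jobId, taskId) {
  const task = await getJson(`/job/${jobId}/task/${taskId}`);
  const asset = await getJson(`/job/${jobId}/asset/${task.assetId}`);
  const referencePaths = (asset.references || []).map(
    (ref) => `/job/${jobId}/asset/${asset.id}/reference/${ref.id}/downloadfile`
  );
  return [
    `/job/${jobId}/task/${taskId}/downloaddeliverable`,
    `/job/${jobId}/asset/${asset.id}/downloadfile`,
    ...referencePaths,
  ];
}

// Stub standing in for real HTTP GET calls; the data mirrors the earlier example.
const fixtures = {
  '/job/1/task/14': { id: 14, assetId: 5 },
  '/job/1/asset/5': { id: 5, references: [{ id: 23 }, { id: 24 }] },
};
const stubGetJson = async (path) => fixtures[path];

collectKitPaths(stubGetJson, 1, 14).then((paths) => console.log(paths));
```

That is two metadata requests before any file is downloaded, which is the cost being weighed against listing asset ids directly on the task.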

While I get that the single asset in the task might be a payload that consists of many parts itself, I wonder if that's limiting in the spec.

Alino commented 6 years ago

Asset.path would be a reference for the sender of the file. A particular file might be referenced by a path on the client server. Uploading as an asset loses this info. When pulling back a final translated file, there could be benefit in knowing the original path it came from.

Agree this could be useful, would you like to create a separate issue for this?

"uploadedDeliverable". xliff file or other translation transport file that has been completed by vendor

This is what we currently store inside a Task, not inside an Asset.

"downloadableDeliverable" xliff or other translation transport file that is ready for translation by vendor.

This is what I understand to be an Asset's file (currently Asset.file)

"content". source content file

What is the difference between "content" and "downloadableDeliverable"? My thinking is that both are source files ready to be worked on.

"termbase" subset of the system termbase that would hold only terms relating to the particular content.

Is this somehow actionable? Would someone want to perform a task on a termbase, like translation? Or is this just reference material?

"translation memory" subset of the system translation memory that would hold candidate matches for the vendor.

Same question as for "termbase"


A deliverable is an asset, right? Upload it as one. Tag it as what it is. Tie it to the task. Minimizes number of calls.

A deliverable is an output of a finished Task.

...There could be many assets associated with a task.

That's incorrect. Each Asset can have many associated Tasks. One Task cannot be associated with more than one Asset.
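To illustrate the relationship stated here: each Task carries exactly one assetId, while an Asset collects many Tasks. A minimal sketch with made-up sample data (only task 14 and asset 5 appear in this thread; the other ids and the helper name `tasksByAsset` are hypothetical):

```javascript
// Each task points at exactly one asset; an asset may be referenced by many tasks.
const tasks = [
  { id: 14, assetId: 5, targetLanguage: 'sk' },
  { id: 15, assetId: 5, targetLanguage: 'de' }, // made-up second task on the same asset
  { id: 16, assetId: 6, targetLanguage: 'sk' }, // made-up task on another asset
];

// Build an index from asset id to the ids of its tasks.
function tasksByAsset(tasks) {
  const index = new Map();
  for (const task of tasks) {
    const ids = index.get(task.assetId) || [];
    ids.push(task.id);
    index.set(task.assetId, ids);
  }
  return index;
}

console.log(tasksByAsset(tasks));
```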

Yes, the solution in your previous post would solve the problem. You would need to download the task, find the single asset id, then download the asset and find the asset id's references and download those.

You could just GET info about an Asset, and it would show you all Reference ids and all Task ids. So you can download everything related to that Asset.

assembledStarDust commented 6 years ago

Agree this could be useful, would you like to create a separate issue for this?

Complete.

assembledStarDust commented 6 years ago

My original question could be handled as a "payload", referring to WG2.

wrt Asset.assetType: Yes, items like Termbase and Translation Memory are references.

What is the difference between "content" and "downloadableDeliverable"? My thinking is both are source files ready to be worked on.

No, they are not. I was considering "content" to be the source file from the client. This may need extraction in accordance with WG3 to become a "downloadableDeliverable" asset.

I still think that Asset.isReference is too broad. There may be benefit in separating out source file types, downloadableDeliverables, uploadableDeliverables, Deliverables and references when creating assets. Adding Asset.Description may also be beneficial.

I'll hive off this part of the discussion into a separate issue.

That's incorrect. Each Asset can have many associated Tasks. One Task cannot be associated with more than one Asset.

Ah, OK.

Reviewing the git commit for issue #18, I understand that there is a one-to-many relationship between asset and task.

I'll consider re-opening issue #18 to continue this conversation.