Closed tschaffter closed 3 years ago
The dashboard will be developed in this repository.
@jiaxinmachine88 Could you please create a document to track what we want to include in this participation dashboard? Maybe a Google Doc with a section for each metric. Each section should provide at least the following information:
Tools
Number of unique tools
From submission X, count the number of submissions with a unique tool name
I'm going with these assumptions for now, feel free to correct:
"Number of unique submitters"
unique(submitterid)
"Number of tasks / benchmarks open"
Could you elaborate? Something to do with the status column?
"Latest version of the NLP Sandbox schemas"
Could you elaborate?
"Number of datasets / data sites"
unique(dataset_name) Could you elaborate on data sites?
"Number of unique tools (e.g. using the Tool.name)"
unique(tool_name)
"Programming languages used by the tools (would need to add Tool.language)"
I assume this will be added?
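To make the assumptions above concrete, here is a minimal sketch of the `unique(...)` counts in Python. The column names `submitterid`, `tool_name` and `dataset_name` come from the list above; the helper `count_unique` and the sample rows are hypothetical and only stand in for the real submission table.

```python
def count_unique(rows, column):
    """Return the number of distinct, non-missing values in `column`."""
    values = {row.get(column) for row in rows}
    values.discard(None)  # ignore rows where the column is missing
    return len(values)


# Hypothetical rows standing in for the submission table.
rows = [
    {"submitterid": 101, "tool_name": "date-annotator", "dataset_name": "dataset-a"},
    {"submitterid": 101, "tool_name": "date-annotator", "dataset_name": "dataset-b"},
    {"submitterid": 202, "tool_name": "person-annotator", "dataset_name": "dataset-a"},
]

metrics = {
    "unique_submitters": count_unique(rows, "submitterid"),
    "unique_tools": count_unique(rows, "tool_name"),
    "datasets": count_unique(rows, "dataset_name"),
}
print(metrics)
```

If the table is loaded some other way (for example as a data frame), the same counts would just be the length of the set of distinct values per column.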
@andrewelamb
Number of tasks / benchmarks open
We could use the number of unique evaluation IDs listed in this table (left side).
Latest version of the NLP Sandbox schemas
The services and tools of the NLP Sandbox are based on the NLP Sandbox Schemas. The latest version number could be retrieved from the GitHub API. Alternatively, I'm OK adding a file .nlpsandbox-version to this repo with the schemas version x.y.z. Here is an example of this file hosted in another GH repo.
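For the file-based option, a rough sketch of how the dashboard could read the version is below. It assumes the `.nlpsandbox-version` file contains nothing but a single `x.y.z` string; the function name `parse_schemas_version` is hypothetical, and the GitHub API route is not shown.

```python
import re

# Assumption: the file holds exactly one version of the form x.y.z.
VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def parse_schemas_version(file_content: str) -> str:
    """Validate and return the schemas version found in the file content."""
    version = file_content.strip()
    if not VERSION_PATTERN.fullmatch(version):
        raise ValueError(f"Unexpected version string: {version!r}")
    return version


# Usage with the file content already read into a string:
print(parse_schemas_version("1.2.0\n"))
```

Failing loudly on anything that is not `x.y.z` keeps a malformed file from silently showing up on the dashboard.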
"Number of datasets / data sites"
When an NLP developer submits a tool, this tool is evaluated on data hosted at different physical locations (data sites). Currently there are two data sites enabled: Sage and Medical College of Wisconsin (MCW). For now we can use a static value (2).
The number of datasets can be obtained from the above table as length(unique(dataset_name)).
"Programming languages used by the tools (would need to add Tool.language)" I assume this will be added?
Yes
@andrewelamb The number of tasks open (i.e. evaluation queues) is incorrect. I count 6 evaluation queues on the table page. The rest looks good!
@tschaffter Yep, I was grabbing the wrong column. That's been fixed.
@tschaffter Would it be all right to use @thomasyu888's credentials for the docker image? He would need read permission to the source table and edit permission to the directory we would want to store the html output file.
@andrewelamb Where is the Docker container going to run? We have a bot account that we can use for this task. I'll create the token and share it with you using our favorite password manager.
@tschaffter, I don't have an answer to either of those questions. :) @thomasyu888 ?
@tschaffter It will run on our Kubernetes cluster, and it will need a Synapse PAT with download permissions.
Sounds good!
Closing this issue in favor of smaller issues to be created in the repository of the dashboard: https://github.com/nlpsandbox/participation-dashboard
Create a dashboard that includes statistics about the tools evaluated in the NLP Sandbox. Example metrics include:
General stats:
Tool.name
Tool.language
Notes:
References