DataSeer / dataseer-web

DataSeer web application
GNU General Public License v3.0

dataseer-web

Purposes

This repository contains the DataSeer web application, which aims to guide the authors of scientific articles and manuscripts toward best practices in research data sharing, i.e. to ensure that the datasets accompanying an article are associated with a data availability statement, permanent identifiers, and, more generally, the requirements of Open Science and reproducibility.

Machine learning techniques are used to extract and structure the information in a scientific article, to identify the contexts that introduce datasets, and finally to classify these contexts into predicted data types and subtypes. The web application uses these ML predictions to help authors describe, in an efficient and assisted manner, the datasets used in the article and how these data are shared with the scientific community.

See the dataseer-ml repository for the machine learning services used by DataSeer web.

Supported article formats are PDF, docx, TEI, JATS/NLM, ScholarOne, and a large variety of additional publisher native XML formats: BMJ, Elsevier staging format, OUP, PNAS, RSC, Sage, Wiley, etc. (see Pub2TEI for the list of native publisher XML formats covered).

Contacts and licences

Main authors and contact: Nicolas Kieffer, Patrice Lopez (patrice.lopez@science-miner.com).

The development of dataseer-ml is supported by a Sloan Foundation grant, see here.

dataseer-web is distributed under the Apache 2 license.

Description

This application is composed of:

Documents, Organizations and Accounts data are stored in MongoDB. Files (PDF, XML and TEI) uploaded to dataseer-web are stored on the server's file system.

Documentation

Install

npm i
# Node.js v16.0
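
The Node.js version can be checked before installing. The following is a minimal sketch, not part of dataseer-web: it assumes that v16 is a minimum supported major version (the README only names "V16.0"), and the names `REQUIRED_MAJOR`, `parseMajor` and `isSupported` are hypothetical.

```javascript
// Hypothetical pre-install check; not shipped with dataseer-web.
// Assumption: Node.js 16 is treated as the minimum supported major version.
const REQUIRED_MAJOR = 16;

// Accepts "16.0.0" or "v16.0.0" and returns the major version as a number.
function parseMajor(version) {
  return Number.parseInt(String(version).replace(/^v/, '').split('.')[0], 10);
}

// True when the given version string is at least the required major version.
function isSupported(version, required = REQUIRED_MAJOR) {
  return parseMajor(version) >= required;
}

if (!isSupported(process.versions.node)) {
  console.warn(
    `Node.js ${process.versions.node} detected; this README mentions v${REQUIRED_MAJOR}.0.`
  );
}
```

Saved under any file name and run with `node`, the script prints a warning only when the running version is older than the assumed minimum.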

Run

npm run            # Display list of available options
npm start          # Start headless process with forever (production)
npm run start-dev  # Start process (development)
npm stop           # Stop headless process

Dependencies

Application requires:

Configurations

Web Application Configuration

You must create the configuration files (based on the *.default files) and fill them in with your own data:
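
Creating the configuration files from their `*.default` templates could be scripted. This is a minimal sketch under stated assumptions: the helper `copyDefaults` is hypothetical, and it assumes the templates sit directly in a single directory (shown as `conf` in the usage comment, which is a guess based on the paths mentioned elsewhere in this README). The copied files still need to be edited by hand.

```javascript
const fs = require('fs');
const path = require('path');

// Copies every "<name>.default" file in `dir` to "<name>" unless "<name>" already exists.
// Returns the list of files that were created.
// Assumption: the *.default templates sit directly in `dir` (no recursion).
function copyDefaults(dir) {
  const created = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (!entry.isFile() || !entry.name.endsWith('.default')) continue;
    const target = path.join(dir, entry.name.slice(0, -'.default'.length));
    if (fs.existsSync(target)) continue; // never overwrite an existing configuration
    fs.copyFileSync(path.join(dir, entry.name), target);
    created.push(target);
  }
  return created;
}

// Usage (hypothetical directory): copyDefaults('conf');
```

Skipping existing targets means the script can be rerun without discarding values that were already filled in.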

JWT Configuration

This application requires a private key to create JSON Web Tokens. You must create the file conf/private.key and fill it with a random string (a long random string is strongly recommended).
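
One way to produce such a random string is Node's built-in `crypto` module. The sketch below is only an illustration: `generateSecret` and `writePrivateKey` are hypothetical helpers, and the 64-byte length and hex encoding are assumptions, since the README does not prescribe a length or format.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// Returns a random hex string; 64 random bytes give 128 hex characters.
// Assumption: any long random string is acceptable as the key.
function generateSecret(bytes = 64) {
  return crypto.randomBytes(bytes).toString('hex');
}

// Writes a fresh secret to `file` unless the file already exists.
// Returns true when a new key was written, false when an existing one was kept.
function writePrivateKey(file, bytes = 64) {
  if (fs.existsSync(file)) return false;
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, generateSecret(bytes), { mode: 0o600 });
  return true;
}

// Usage: writePrivateKey('conf/private.key');
```

An existing key is left untouched because tokens signed with it would generally stop validating once the key changes.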

Mails

All the files concerning the mails are in the conf/mails directory.

Data Access

Your role defines which data you can access.