ropensci / jsonld

R wrapper for jsonld.js JavaScript library
https://docs.ropensci.org/jsonld

toJSON function unstable #14

Open Hz-EMW opened 6 years ago

Hz-EMW commented 6 years ago

Dear Professor Ooms: As far as I know, jsonlite is an excellent tool for format conversion. However, converting a data frame to JSON with toJSON does not behave as I expect, and I cannot get it to generate the structure I want. The data frame I want to convert looks like this:

> dput(itemld)
structure(list(mdv = structure(1:12, .Label = c("Abstract", "Appears in Collections", 
"Author", "Conference Name", "Conference Place", "Content Type", 
"English Abstract", "Issued Date", "Keyword", "Language", "Title", 
"URI"), class = "factor"), ldv = structure(c(6L, 2L, 8L, 1L, 
3L, 12L, 7L, 10L, 9L, 5L, 11L, 4L), .Label = c("activitystreams:Event", 
"activitystreams:Mention", "activitystreams:place", "dc:identifier", 
"dc:language", "dcterms:abstract", "dcterms:abstract_en", "dcterms:creator", 
"dcterms:description", "dcterms:issued", "dcterms:title", "dcterms:type"
), class = "factor"), meta_value = structure(c(11L, 12L, 5L, 
2L, 6L, 9L, 4L, 1L, 8L, 10L, 7L, 3L), .Label = c("2014-06-20", 
"DLIB – OSS 2014", "http://ir.las.ac.cn/handle/12502/7163", 
"This presentation introduces the relevant concepts, value, and issues of rights management. Then, speaker presents his cases analysis, and to proposal need of information instracture for data citation, data publishing, data metrics and data sharing. Finally, introduced by the reporter's recent 2 years work to this topic, this presentation suggests a collaboration proposal", 
"顾立平", "湖南长沙", "科学数据开放共享的权益政策问题与基础设施需求", 
"科研数据\n;  Research Data\n;  权益管理\n;  rights management\n;  数据共享\n;  data sharing\n;  数据引用\n;  data citation\n;  数据出版\n;  data publishing\n;  数据计量\n;  data metrics", 
"演示报告", "英语", "这次演讲首先介绍了科研数据开放共享的相关概念、意义,以及权益管理的议题。接着,根据案例分析提出对数据引用、数据发布、数据计量和数据共享上的信息基础设施需求。最后,通过演讲者介绍最近2年在这个主题上的工作,提出合作建议", 
"中国科学院文献情报中心_业务处_演示报告"), class = "factor")), row.names = c(NA, 
-12L), class = "data.frame")

The JSON I want to produce from this data frame looks like this:

{
["dcterms:title": "科学数据开放共享的权益政策问题与基础设施需求"]
["dcterms:description": "科研数据\n;  Research Data\n;  权益管理\n;  rights management\n;  数据共享\n;  data sharing\n;  数据引用\n;  data citation\n;  数据出版\n;  data publishing\n;  数据计量\n;  data metrics"]
["dcterms:abstract": "这次演讲首先介绍了科研数据开放共享的相关概念、意义,以及权益管理的议题。接着,根据案例分析提出对数据引用、数据发布、数据计量和数据共享上的信息基础设施需求。最后,通过演讲者介绍最近2年在这个主题上的工作,提出合作建议"]
[ "dcterms:creator": "顾立平"]
["dcterms:type": "演示报告"]
[ "dcterms:issued":"2014-06-20"]
} 

No matter how I change the parameters of the function, I cannot get a result in the form above. The conversion goes wrong in three ways. First, the row names (or column names) of the data frame should become the “name” strings in the JSON, and the values of the data frame should become the “values” in the JSON, but the result often comes out the other way round:

> itemldm <- data.frame(itemld[, "meta_value"], row.names = itemld[, "ldv"], check.names = TRUE)
> itemj <- toJSON(itemldm, dataframe = "values", pretty = TRUE)
> itemj
[
  ["这次演讲首先介绍了科研数据开放共享的相关概念、意义,以及权益管理的议题。接着,根据案例分析提出对数据引用、数据发布、数据计量和数据共享上的信息基础设施需求。最后,通过演讲者介绍最近2年在这个主题上的工作,提出合作建议", "dcterms:abstract"],
  ["中国科学院文献情报中心_业务处_演示报告", "activitystreams:Mention"],
  ["顾立平", "dcterms:creator"],
  ["DLIB – OSS 2014", "activitystreams:Event"],
  ["湖南长沙", "activitystreams:place"],
  ["演示报告", "dcterms:type"],
  ["This presentation introduces the relevant concepts, value, and issues of rights management. Then, speaker presents his cases analysis, and to proposal need of information instracture for data citation, data publishing, data metrics and data sharing. Finally, introduced by the reporter's recent 2 years work to this topic, this presentation suggests a collaboration proposal", "dcterms:abstract_en"],
  ["2014-06-20", "dcterms:issued"],
  ["科研数据\n;  Research Data\n;  权益管理\n;  rights management\n;  数据共享\n;  data sharing\n;  数据引用\n;  data citation\n;  数据出版\n;  data publishing\n;  数据计量\n;  data metrics", "dcterms:description"],
  ["英语", "dc:language"],
  ["科学数据开放共享的权益政策问题与基础设施需求", "dcterms:title"],
  ["http://ir.las.ac.cn/handle/12502/7163", "dc:identifier"]
] 

Second, column names and row names sometimes become detached from the values in the data frame and interfere with each other during conversion from data frame to JSON.
In the example below, where columns are selected from the data frame, the entries I intend as names are written as “values” rather than “names” in the JSON.

> itemj <- toJSON(itemld[,c("ldv", "meta_value")], dataframe = c("rows"), pretty = TRUE)
> itemj
[
  {
    "ldv": "dcterms:abstract",
    "meta_value": "这次演讲首先介绍了科研数据开放共享的相关概念、意义,以及权益管理的议题。接着,根据案例分析提出对数据引用、数据发布、数据计量和数据共享上的信息基础设施需求。最后,通过演讲者介绍最近2年在这个主题上的工作,提出合作建议"
  },
  {
    "ldv": "activitystreams:Mention",
    "meta_value": "中国科学院文献情报中心_业务处_演示报告"
  },
  {
    "ldv": "dcterms:creator",
    "meta_value": "顾立平"
  },
  {
    "ldv": "activitystreams:Event",
    "meta_value": "DLIB – OSS 2014"
  },
  {
    "ldv": "activitystreams:place",
    "meta_value": "湖南长沙"
  },
  {
    "ldv": "dcterms:type",
    "meta_value": "演示报告"
  },
  {
    "ldv": "dcterms:abstract_en",
    "meta_value": "This presentation introduces the relevant concepts, value, and issues of rights management. Then, speaker presents his cases analysis, and to proposal need of information instracture for data citation, data publishing, data metrics and data sharing. Finally, introduced by the reporter's recent 2 years work to this topic, this presentation suggests a collaboration proposal"
  },
  {
    "ldv": "dcterms:issued",
    "meta_value": "2014-06-20"
  },
  {
    "ldv": "dcterms:description",
    "meta_value": "科研数据\n;  Research Data\n;  权益管理\n;  rights management\n;  数据共享\n;  data sharing\n;  数据引用\n;  data citation\n;  数据出版\n;  data publishing\n;  数据计量\n;  data metrics"
  },
  {
    "ldv": "dc:language",
    "meta_value": "英语"
  },
  {
    "ldv": "dcterms:title",
    "meta_value": "科学数据开放共享的权益政策问题与基础设施需求"
  },
  {
    "ldv": "dc:identifier",
    "meta_value": "http://ir.las.ac.cn/handle/12502/7163"
  }
] 

Last but not least, the column names and row names of the data frame may be lost during conversion.

> itemj <- toJSON(itemld[,c("ldv", "meta_value")], dataframe = c("columns"), pretty = TRUE)
> itemj
{
  "ldv": ["dcterms:abstract", "activitystreams:Mention", "dcterms:creator", "activitystreams:Event", "activitystreams:place", "dcterms:type", "dcterms:abstract_en", "dcterms:issued", "dcterms:description", "dc:language", "dcterms:title", "dc:identifier"],
  "meta_value": ["这次演讲首先介绍了科研数据开放共享的相关概念、意义,以及权益管理的议题。接着,根据案例分析提出对数据引用、数据发布、数据计量和数据共享上的信息基础设施需求。最后,通过演讲者介绍最近2年在这个主题上的工作,提出合作建议", "中国科学院文献情报中心_业务处_演示报告", "顾立平", "DLIB – OSS 2014", "湖南长沙", "演示报告", "This presentation introduces the relevant concepts, value, and issues of rights management. Then, speaker presents his cases analysis, and to proposal need of information instracture for data citation, data publishing, data metrics and data sharing. Finally, introduced by the reporter's recent 2 years work to this topic, this presentation suggests a collaboration proposal", "2014-06-20", "科研数据\n;  Research Data\n;  权益管理\n;  rights management\n;  数据共享\n;  data sharing\n;  数据引用\n;  data citation\n;  数据出版\n;  data publishing\n;  数据计量\n;  data metrics", "英语", "科学数据开放共享的权益政策问题与基础设施需求", "http://ir.las.ac.cn/handle/12502/7163"]
} 
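
One possible way to get name/value pairs like those in the desired output, offered as a sketch rather than a confirmed fix: toJSON writes a named list as a JSON object, so the ldv entries can serve as the list names and the meta_value entries as the list elements. The three-row `itemld` below is a shortened stand-in for the data frame in the post, and `item_list` is a name introduced only for illustration. Note that the bracketed layout in the desired output is not valid JSON; a flat object is the closest valid form.

```r
library(jsonlite)

# Shortened stand-in for the itemld data frame from the post (assumption:
# three rows are enough to show the idea; both columns are factors, as in dput).
itemld <- data.frame(
  ldv = c("dcterms:title", "dcterms:creator", "dcterms:issued"),
  meta_value = c("科学数据开放共享的权益政策问题与基础设施需求", "顾立平", "2014-06-20"),
  stringsAsFactors = TRUE
)

# Build a named list: the names become JSON keys, the elements JSON values.
# as.character() is needed because the columns are factors, not strings.
item_list <- setNames(
  as.list(as.character(itemld$meta_value)),
  as.character(itemld$ldv)
)

# auto_unbox = TRUE writes length-1 vectors as scalars instead of
# one-element arrays.
itemj <- toJSON(item_list, auto_unbox = TRUE, pretty = TRUE)
itemj
```

With the full data frame, the same steps should give a single object with one name/value pair per row; if an ldv entry occurred more than once, the object would contain repeated keys.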

I also tried transposing the data frame, that is, turning the rows into columns so that the column names map to the values:

> itemldm <- t(data.frame(itemld[, "meta_value"], row.names = itemld[, "ldv"], check.names = TRUE))

The results below are from tests with different values of the “dataframe” parameter:

> itemj <- toJSON(itemldm, dataframe = "rows", pretty = TRUE)
> itemj
[
  ["这次演讲首先介绍了科研数据开放共享的相关概念、意义,以及权益管理的议题。接着,根据案例分析提出对数据引用、数据发布、数据计量和数据共享上的信息基础设施需求。最后,通过演讲者介绍最近2年在这个主题上的工作,提出合作建议", "中国科学院文献情报中心_业务处_演示报告", "顾立平", "DLIB – OSS 2014", "湖南长沙", "演示报告", "This presentation introduces the relevant concepts, value, and issues of rights management. Then, speaker presents his cases analysis, and to proposal need of information instracture for data citation, data publishing, data metrics and data sharing. Finally, introduced by the reporter's recent 2 years work to this topic, this presentation suggests a collaboration proposal", "2014-06-20", "科研数据\n;  Research Data\n;  权益管理\n;  rights management\n;  数据共享\n;  data sharing\n;  数据引用\n;  data citation\n;  数据出版\n;  data publishing\n;  数据计量\n;  data metrics", "英语", "科学数据开放共享的权益政策问题与基础设施需求", "http://ir.las.ac.cn/handle/12502/7163"]
] 
> itemj <- toJSON(itemldm, dataframe = "columns", pretty = TRUE)
> itemj
[
  ["这次演讲首先介绍了科研数据开放共享的相关概念、意义,以及权益管理的议题。接着,根据案例分析提出对数据引用、数据发布、数据计量和数据共享上的信息基础设施需求。最后,通过演讲者介绍最近2年在这个主题上的工作,提出合作建议", "中国科学院文献情报中心_业务处_演示报告", "顾立平", "DLIB – OSS 2014", "湖南长沙", "演示报告", "This presentation introduces the relevant concepts, value, and issues of rights management. Then, speaker presents his cases analysis, and to proposal need of information instracture for data citation, data publishing, data metrics and data sharing. Finally, introduced by the reporter's recent 2 years work to this topic, this presentation suggests a collaboration proposal", "2014-06-20", "科研数据\n;  Research Data\n;  权益管理\n;  rights management\n;  数据共享\n;  data sharing\n;  数据引用\n;  data citation\n;  数据出版\n;  data publishing\n;  数据计量\n;  data metrics", "英语", "科学数据开放共享的权益政策问题与基础设施需求", "http://ir.las.ac.cn/handle/12502/7163"]
]
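
A likely explanation for the two identical results above, sketched with a toy data frame (`df` and `m` are illustrative names, not from the post): t() applied to a data frame returns a character matrix rather than a data frame, so the `dataframe` argument of toJSON appears to have no effect on it, and the row and column names of the matrix are not written out.

```r
library(jsonlite)

# Toy data frame with row names, standing in for the one built in the post.
df <- data.frame(meta_value = c("a", "b"), row.names = c("k1", "k2"))

# Transposing a data frame goes through as.matrix(), so the result is a matrix.
m <- t(df)
is.matrix(m)
is.data.frame(m)

# Because m is a matrix, changing `dataframe` should not change the output.
out_rows <- toJSON(m, dataframe = "rows")
out_cols <- toJSON(m, dataframe = "columns")
identical(as.character(out_rows), as.character(out_cols))
```

If the names need to survive, it is probably simpler to skip t() and build a named list or a one-row data frame directly.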