iuni-cadre / Collaborative-projects

For non-fellow collaborative projects on CADRE

Management science paper extraction (Yangyang Chang) #2

Open XiaoranYan opened 5 years ago

XiaoranYan commented 5 years ago

Hi Xiaoran,

Thank you very much for the dataset. We'll download it and take a look in the next few days.

Best,

Yangyang

From: Yan, Xiaoran Sent: February 17, 2019 22:18 To: Chang, Yangyang Cc: Ding, Ying; Hutchinson, Matthew Alexander; Ma, He; Pentchev, Valentin; Mabry, Patricia L Subject: RE: [Update]_Journal list of management science

Hi Yangyang,

Here is my first take on your requested dataset. The data is from WoS and currently has only the papers in the listed journals. You can download it with the following link (valid for a week; I suggest using a download manager to open it):

https://iunimag.blob.core.windows.net/mag-2019-01-25/data-1550457554868.csv.gz?st=2019-02-18T02%3A49%3A14Z&se=2019-02-26T02%3A49%3A00Z&sp=rl&sv=2017-07-29&sr=b&sig=i1LAoVDNAKXJhtATuK%2BWUansGeKKJW879ne%2BMlczYz4%3D

Once unpacked, the csv file will be 300GB with 191993 papers. I added one more column, “addressverified”, to flag papers with “enhanced affiliations” from WoS, meaning their authors and addresses are clearly mapped in order.
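
A file of this size is easier to handle as a stream than unpacked on disk. The following is a minimal sketch, not the procedure used in this thread. It assumes the download was saved locally, that the CSV has a header row, and that the flag column is literally named `addressverified` with the values `true`/`false`. Only the column name comes from the message above; the header row and the value format are assumptions.

```python
import csv
import gzip

def count_verified(path, flag_column="addressverified"):
    """Stream a gzipped CSV and return (total rows, rows flagged as verified)."""
    # Abstract and refs fields can be long, so raise the csv field-size cap.
    csv.field_size_limit(2**31 - 1)
    total = verified = 0
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        for row in csv.DictReader(handle):
            total += 1
            # Assumed value format: the string "true" marks a verified row.
            if (row.get(flag_column) or "").strip().lower() == "true":
                verified += 1
    return total, verified
```

Reading through `gzip.open` in text mode means only one record is held in memory at a time, so the same loop can be reused for filtering or sampling records.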

Please download it and get back to me if you find any problems. In my experience, it takes a few updates to finalize a dataset as your research progresses. I will be producing another dataset for the external citations (the refs in the current file already contain external documents, but their information is not listed).

Thank you!

Xiaoran

From: Chang, Yangyang Sent: Wednesday, February 13, 2019 8:58 PM To: Yan, Xiaoran yan30@iu.edu Subject: [Update]_Journal list of management science

Hi Xiaoran,

We have updated our journal list and added three more journals. If you haven't started collecting our data yet, you can use the new list:

Management Science (MS)

Operations Research (OR)

European Journal of Operational Research (EJOR)

Journal of the Operational Research Society (JORS)

Annals of Operations Research (AOR)

Operations Research Letters (OR Letters)

Interfaces 

Computers and Operations Research (COR)

Decision Sciences (DS)

IIE Transactions (IIE)

International Journal of Operations and Production Management (IJOPM)

International Journal of Production Research (IJPR) 

Journal of Operations Management (JOM)

Production and Operations Management Journal (POM)

Manufacturing & Service Operations Management (MSOM)

Naval Research Logistics

INFORMS Journal on Computing

Mathematics of Operations Research

Thanks! :)

Yangyang

From: Chang, Yangyang Sent: February 5, 2019 21:14 To: Yan, Xiaoran Cc: Ding, Ying Subject: Re: Journal list of management science

Hi Xiaoran,

Thanks for doing that. Here are the specifics of our data requirements:

  1. doi/id, title, authors, author affiliations, country, publishing year, journal, abstract, keywords, document type (research article, note, communication etc.);
  2. internal citations;
  3. external citations (if possible): discipline/field information and the information in entry 1 of the references and citations of the articles in the journal list.
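
The distinction between internal and external citations in items 2 and 3 can be sketched as follows. This is an illustration only: it assumes each core paper's reference list is already available as a list of IDs, and the function name and the toy IDs in the usage note are hypothetical.

```python
def split_citations(core_refs):
    """Split reference edges by whether the cited paper is in the core set.

    core_refs maps each core paper ID to the list of IDs it cites.
    Returns (internal_edges, external_edges) as lists of (citing, cited).
    """
    core_ids = set(core_refs)
    internal, external = [], []
    for citing, refs in core_refs.items():
        for cited in refs:
            edge = (citing, cited)
            # A reference is internal when its target is also a core paper.
            (internal if cited in core_ids else external).append(edge)
    return internal, external
```

For example, `split_citations({"A": ["B", "X"], "B": ["A"]})` treats `A -> B` and `B -> A` as internal and `A -> X` as external. Note that this only covers outgoing references; papers outside the list that cite the core papers would have to come from a separate source.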

Best,

Yangyang

From: Yan, Xiaoran Sent: February 4, 2019 14:56 To: Chang, Yangyang Cc: Ding, Ying Subject: Re: Journal list of management science

Hi Yangyang,

Thank you for the list. We are in the process of moving to a new server, but I should be able to get you the dataset in 2 weeks.

Can you also specify which other features you need besides abstracts and citations, such as authors and institutions? For citations, I imagine you would need more than just the internal citations between these journals. For external citations, what information do you need? The same as for the papers in the list?

Thank you!

Xiaoran

On 1/31/19 4:45 PM, Chang, Yangyang wrote:

Hi Xiaoran,

Here is the journal list for operations research/operations management. Let's see whether we can get a more complete and detailed dataset of the abstracts and citations from your data platform. Thanks!

Management Science (MS)

Operations Research (OR)

European Journal of Operational Research (EJOR)

Journal of the Operational Research Society (JORS)

Annals of Operations Research (AOR)

Operations Research Letters (OR Letters)

Interfaces

Computers and Operations Research (COR)

Decision Sciences (DS)

IIE Transactions (IIE)

International Journal of Operations and Production Management (IJOPM)

International Journal of Production Research (IJPR)

Journal of Operations Management (JOM)

Production and Operations Management Journal (POM)

Manufacturing & Service Operations Management (MSOM)

Best,

Yangyang

XiaoranYan commented 5 years ago

Hi Yangyang,

The external citation dataset is ready. I added one new column, “cited”, to flag whether the paper is citing or being cited by the core dataset. You can download it here:

https://iunimag.blob.core.windows.net/mag-2019-01-25/YangyangExtra.csv.gz?st=2019-02-22T22%3A01%3A23Z&se=2019-03-02T22%3A01%3A00Z&sp=rl&sv=2017-07-29&sr=b&sig=aQYhiNJz7KXjrbx%2B8We9Xv%2B0XH%2BwRoPsHqdjujk8fhE%3D

Please check the dataset and let me know if there is any problem.

Thanks!

Xiaoran

changyy11 commented 5 years ago

Hi Xiaoran,

Sorry for the late reply. While working with the dataset, I found some problems with the last data you sent.

The core dataset has 178001 records, which add up to 30G. However, there are many duplications in the attributes "abstract", "keywords", and "refs": for example, the abstract is repeated several times within one record. The same problems appear in the extra dataset. Could you help us figure this out?

Besides, is it possible for us to get the citing sentences of the core dataset? That could be really helpful.

Thanks!

Yangyang

XiaoranYan commented 5 years ago

Hi Yangyang,

Thank you for the feedback! We realize that the current implementation for abstracts and keywords is not perfect. We had a similar issue with another user earlier. The root of the problem lies in the raw WoS data, and we are trying to remedy it in our new CADRE system.

To help us debug, can you specify the particular records that have duplicates in your data? Please give us the WoSid and describe in detail which columns are duplicated.

Thanks! Xiaoran

changyy11 commented 5 years ago

Hi Xiaoran,

I randomly examined some records in the dataset, and they all have the same duplication problem.

For the core dataset, the duplicated attributes are "abstract", "keywords", and "refs". The data is duplicated whenever it is not NULL. Here is an example where we can see the duplicates: (WOS:000257343900020)

"Jones, Philip C.|Ohlmann, Jeffrey W.","Univ Iowa, Dept Management Sci, Iowa City, IA 52242 USA|Univ Iowa, Dept Management Sci, Iowa City, IA 52242 USA","USA|USA","EUROPEAN JOURNAL OF OPERATIONAL RESEARCH","WOS:000257343900020","10.1016/j.ejor.2007.08.033","Journal","Long-range timber supply planning for a vertically integrated paper mill","2008","[1]We consider a vertically integrated papermaking operation composed of an integrated pulp and paper mill with its regional supply network. Considering land procurement and harvest rotation as strategic decision variables, we construct a model to establish a long-range timber supply plan that minimizes the total discounted cost of meeting annual virgin wood fiber demand over an infinite horizon. Under appropriate assumptions on costs and storage, the land procurement and harvest rotation decisions are separable with harvest rotation being determined via a forest economics-type equation and land procurement being determined by a newsvendor-type equation. Published by Elsevier B.V.;[1]We consider a vertically integrated papermaking operation composed of an integrated pulp and paper mill with its regional supply network. Considering land procurement and harvest rotation as strategic decision variables, we construct a model to establish a long-range timber supply plan that minimizes the total discounted cost of meeting annual virgin wood fiber demand over an infinite horizon. Under appropriate assumptions on costs and storage, the land procurement and harvest rotation decisions are separable with harvest rotation being determined via a forest economics-type equation and land procurement being determined by a newsvendor-type equation. Published by Elsevier B.V.;[1]We consider a vertically integrated papermaking operation composed of an integrated pulp and paper mill with its regional supply network. 
Considering land procurement and harvest rotation as strategic decision variables, we construct a model to establish a long-range timber supply plan that minimizes the total discounted cost of meeting annual virgin wood fiber demand over an infinite horizon. Under appropriate assumptions on costs and storage, the land procurement and harvest rotation decisions are separable with harvest rotation being determined via a forest economics-type equation and land procurement being determined by a newsvendor-type equation. Published by Elsevier B.V.;[1]We consider a vertically integrated papermaking operation composed of an integrated pulp and paper mill with its regional supply network. Considering land procurement and harvest rotation as strategic decision variables, we construct a model to establish a long-range timber supply plan that minimizes the total discounted cost of meeting annual virgin wood fiber demand over an infinite horizon. Under appropriate assumptions on costs and storage, the land procurement and harvest rotation decisions are separable with harvest rotation being determined via a forest economics-type equation and land procurement being determined by a newsvendor-type equation. Published by Elsevier B.V.;[1]We consider a vertically integrated papermaking operation composed of an integrated pulp and paper mill with its regional supply network. Considering land procurement and harvest rotation as strategic decision variables, we construct a model to establish a long-range timber supply plan that minimizes the total discounted cost of meeting annual virgin wood fiber demand over an infinite horizon. Under appropriate assumptions on costs and storage, the land procurement and harvest rotation decisions are separable with harvest rotation being determined via a forest economics-type equation (...) 
[1]OR in agriculture;[2]forest economics;[3]normal forest;[4]regulated forest;[5]newsvendor model;[6]forestry supply chain management;[1]OR in agriculture;[2]forest economics;[3]normal forest;[4]regulated forest;[5]newsvendor model;[6]forestry supply chain management;[1]OR in agriculture;[2]forest economics;[3]normal forest;[4]regulated forest;[5]newsvendor model;[6]forestry supply chain management;[1]OR in agriculture;[2]forest economics;[3]normal forest;[4]regulated forest;[5]newsvendor model;[6]forestry supply chain management;[1]OR in agriculture;[2]forest economics;[3]normal forest;[4]regulated forest;[5]newsvendor model;[6]forestry supply chain management;[1]OR in agriculture;[2]forest economics;[3]normal forest;[4]regulated forest;[5]newsvendor model","WOS:000257343900020.2;WOS:000085158500015;WOS:000085158500015;WOS:000085158500015;WOS:000085158500015;WOS:000085158500015;WOS:000085158500015;WOS:000257343900020.11;WOS:000257343900020.11;WOS:000257343900020.11;WOS:000257343900020.11;WOS:000257343900020.11;WOS:000257343900020.11;WOS:A1986AXW5200020;WOS:A1986AXW5200020;WOS:A1986AXW5200020;WOS:A1986AXW5200020;WOS:A1986AXW5200020;WOS:A1986AXW5200020;WOS:000257343900020.26;WOS:000257343900020.26;WOS:000257343900020.26;WOS:000257343900020.26;WOS:000257343900020.26;WOS:000257343900020.26;WOS:A1996TU85300007;WOS:A1996TU85300007;WOS:A1996TU85300007;WOS:A1996TU85300007;WOS:A1996TU85300007;WOS:A1996TU85300007;WOS:A1990CU07500015;WOS:A1990CU07500015;WOS:A1990CU07500015;WOS:A1990CU07500015;WOS:A1990CU07500015;WOS:A1990CU07500015 (...)
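
Until a corrected export is available, the repeats visible in the record above could be collapsed on the user side. The sketch below is one possible workaround and not the fix applied on the server. It assumes that abstract and keyword items always take the form `[n]text`, that items are joined by `;`, and that the index `n` identifies one item within a record, so the first text seen for each index is kept. The function names are hypothetical.

```python
import re

# Split only at a ";" that is directly followed by an "[n]" marker,
# so semicolons inside the text itself are left alone.
ITEM_BOUNDARY = re.compile(r";(?=\[\d+\])")
ITEM = re.compile(r"\[(\d+)\](.*)", re.DOTALL)

def collapse_numbered(field):
    """Return the distinct "[n]text" items of a field, ordered by n."""
    if field is None or field == "NULL":
        return []
    first_seen = {}
    for chunk in ITEM_BOUNDARY.split(field):
        match = ITEM.fullmatch(chunk)
        if match is None:
            continue  # not in "[n]text" form; ignored in this sketch
        index, text = int(match.group(1)), match.group(2).strip()
        first_seen.setdefault(index, text)  # keep the first copy per index
    return [first_seen[i] for i in sorted(first_seen)]

def collapse_refs(field):
    """Drop repeated reference IDs while keeping their original order."""
    if field is None or field == "NULL":
        return []
    return list(dict.fromkeys(ref for ref in field.split(";") if ref))
```

Keying on the index rather than on the full text also discards a later copy that was cut off mid-sentence, at the cost of trusting that indices are never reused for different items.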

Besides, I have some questions about the reference data. Here is an example (WOS:000268350400009):

"Havasi, Catherine|Lieberman, Henry|Pustejovsky, James|Speer, Robert","Brandeis Univ, Lab Linguist & Computat, Waltham, MA 02254 USA|MIT, Software Agents Grp, Cambridge, MA 02139 USA|Brandeis Univ, Lab Linguist & Computat, Waltham, MA 02254 USA|MIT, Media Labs Commonsense Comp Initiat, Cambridge, MA 02139 USA","USA|USA|USA|USA","IEEE INTELLIGENT SYSTEMS","WOS:000268350400009",NULL,"Journal","Digital Intuition: Applying Common Sense Using Dimensionality Reduction","2009",NULL,NULL,"WOS:000268350400009.12;WOS:000268350400009.7;WOS:A1995TC17500013;WOS:000268350400009.10;WOS:000268350400009.5;WOS:000268350400009.4;WOS:000268350400009.6;WOS:000182919000077;WOS:000268350400009.2;WOS:000268350400009.3;WOS:000268350400009.1;WOS:000268350400009.13;WOS:000224961900027","true"

The WoSid of the paper is "WOS:000268350400009" and the WoSid of one of its references is "WOS:000268350400009.12". From the format of the id, it seems to indicate only that the item is in the reference list of the paper. Does it match a WoSid in the extra dataset?

For the extra dataset, the duplicated attributes are "keywords" and "refs". Here is an example: (WOS:000211422000006)

Ghodrati\, Behzad|Kumar\, Uday,Lulea Univ Technol\, Div Operat & Maintenance Engn\, Lulea\, Sweden|Lulea Univ Technol\, Div Operat & Maintenance Engn\, Lulea\, Sweden,Sweden|Sweden,JOURNAL OF QUALITY IN MAINTENANCE ENGINEERING,WOS:000211422000006,10.1108/13552510510601366,Journal,APPLICATIONS AND CASE STUDIES Reliability and operating environment-based spare parts estimation approach,2005,[5]Originality/value-Previously\, the state of the specific technology and other factors have demonstrated the need for support in enhancing system effectiveness and preventing unexpected downtime. This paper sets the required number of spare parts necessary to fulfil this need.;[2]Design/methodology/approach-A model is provided in this paper to determine the number of required spare parts with respect to the effect of the external factors\, except time\, on the reliability characteristics of components through the proportional hazard model. The model is verified with estimation of the number of spare hydraulic jacks\, used on a load-haul-dump (LHD) machine\, as non-repairable components. The reliability of this non-repairable part and its operational impact are assessed\, while considering environmental factors and ignoring them.;[1]Purpose - With continuous technological development in the twenty-first century\, the industry and industrial systems have become complex and making their availability more critical. In this context\, the product support and its related issues such as spare parts play an important role. Lack of timely or incomplete support\, such as the lack of spare parts when required\, is likely to cause unexpected downtimes\, which in turn often lead to incompensatable losses. Therefore the importance of predicting the correct support to keep the system functionally available needs to be emphasized. Required number of spare parts could be obtained based on technical and life parameters. 
This paper seeks to examine the system reliability and operating environment\, which are the two parameters to be considered in this article.;[3]Findings - The results indicate that the operating environment of system/ machine has considerable influence on system performance. Forecasting the required support/ spare parts based on technical characteristics and the system-operating environment is an optimal way to prevent unplanned stoppages.;[4]Practical implications- The environmental conditions in which the equipment is to be operated\, such as temperature\, humidity\, dust\, road conditions\, maintenance facilities\, maintenance crew training\, operators' skill\, etc.\, often have considerable influence directly on the system/ machine or component reliability and indirectly on the product supportability characteristics. Spare parts\, are classified as a product support item whose availability is important when planned or unplanned maintenance is to be carried out. Forecasting the required number of spare parts\, based on technical characteristics and operating environmental conditions of a system\, is one of the best ways to optimize unplanned stoppages.,[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory 
management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[5]Sweden;[4]Systems and control theory;[3]Distribution and inventory management;[2]Operations management;[1]Spare parts;[2]Operations management;[3]Distribution and inventory management;[4]Systems and control theory;[5]Sweden;[1]Spare parts;[2]Operations management;[3]Distribution and inventory management;[4]Systems and control theory;[5]Sweden;[1]Spare parts;[2]Operations management;[3]Distribution and inventory management;[4]Systems and control theory;[5]Sweden;[1]Spare parts;[2]Operations management;[3]Distribution and inventory management;[4]Systems and control theory;[5]Sweden;[1]Spare parts;[2]Operations management;[3]Distribution and inventory management;[4]Systems and control theory;[5]Sweden;[1]Spare parts;[2]Operations management;[3]Distribution and inventory management;[4]Systems and control theory;[5]Sweden;[1]Spare 
parts;[2]Operations management;[3]Distribution and inventory management;[4]Systems and control theory;[5]Sweden;[1]Spare parts;[2]Operations management;[3]Distribution and inventory management;[4]Systems and control theory;[5]Sweden;[1]Spare parts;[2]Operations management;[3]Distribution and inventory management;[4]Systems and control theory;[5]Sweden;[1]Spare parts;[2]Operations management;[3]Distribution and inventory management;[4]Systems and control theory;[5]Sweden;[1]Spare parts;[2]Operations management;[3]Distribution and inventory management;[4]Systems and control theory;[5]Sweden;[1]Spare parts;[2]Operations management;[3]Distribution and inventory management;[4]Systems and control theory;[5]Sweden;[1]Spare parts;[2]Operations management;[3]Distribution and inventory management;[4]Systems and control theory;[5]Sweden;[1]Spare parts;[2]Operations management;[3]Distribution and inventory management;[4]Systems and control theory;[5]Sweden,000211422000006.13;000211422000006.23;000211422000006.23;000211422000006.23;000211422000006.23;000211422000006.23;000211422000006.17;000211422000006.17;000211422000006.17;000211422000006.17;000211422000006.17;WOS:000174318200008;WOS:000174318200008;WOS:000174318200008;WOS:000174318200008;WOS:000174318200008;000211422000006.19;000211422000006.19;000211422000006.19;000211422000006.19;000211422000006.19;WOS:A1995QT55800001;WOS:A1995QT55800001;WOS:A1995QT55800001;WOS:A1995QT55800001;WOS:A1995QT55800001;WOS:A1987G015400049;WOS:A1987G015400049;WOS:A1987G015400049;WOS:A1987G015400049;WOS:A1987G015400049;000211422000006.4;000211422000006.4;000211422000006.4;000211422000006.4;000211422000006.4;000211422000006.16;000211422000006.16;000211422000006.16;000211422000006.16;000211422000006.16;000211422000006.21;000211422000006.21;000211422000006.21;000211422000006.21;000211422000006.21;000211422000006.26;000211422000006.26;000211422000006.26;000211422000006.26;000211422000006.26;000211422000006.24;000211422000006.24;000211422000006.24;00021
1422000006.24;000211422000006.24;WOS:000172033400018;WOS:000172033400018;WOS:000172033400018;WOS:000172033400018;WOS:000172033400018;000211422000006.11;000211422000006.11;000211422000006.11;000211422000006.11;000211422000006.11;WOS:000211422000006.22;WOS:000211422000006.22;WOS:000211422000006.22;WOS:000211422000006.22;WOS:000211422000006.22;000211422000006.31;000211422000006.31;000211422000006.31;000211422000006.31;000211422000006.31;000211422000006.12;000211422000006.12;000211422000006.12;000211422000006.12;000211422000006.12;WOS:A1985APN4600004;WOS:A1985APN4600004;WOS:A1985APN4600004;WOS:A1985APN4600004;WOS:A1985APN4600004;000211422000006.13;000211422000006.13;000211422000006.13;000211422000006.13;000211422000006.28;000211422000006.28;000211422000006.28;000211422000006.28;000211422000006.28;WOS:A1970F194200009;WOS:A1970F194200009;WOS:A1970F194200009;WOS:A1970F194200009;WOS:A1970F194200009;WOS:A1989AW96100004;WOS:A1989AW96100004;WOS:A1989AW96100004;WOS:A1989AW96100004;WOS:A1989AW96100004;000211422000006.30;000211422000006.30;000211422000006.30;000211422000006.30;000211422000006.30;000211422000006.20;000211422000006.20;000211422000006.20;000211422000006.20;000211422000006.20;WOS:A1972N572600003;WOS:A1972N572600003;WOS:A1972N572600003;WOS:A1972N572600003;WOS:A1972N572600003;WOS:000088258200004;WOS:000088258200004;WOS:000088258200004;WOS:000088258200004;WOS:000088258200004;WOS:A1985AST4800005;WOS:A1985AST4800005;WOS:A1985AST4800005;WOS:A1985AST4800005;WOS:A1985AST4800005;000211422000006.1;000211422000006.1;000211422000006.1;000211422000006.1;000211422000006.1;000211422000006.5;000211422000006.5;000211422000006.5;000211422000006.5;000211422000006.5;WOS:A1994NT83300010;WOS:A1994NT83300010;WOS:A1994NT83300010;WOS:A1994NT83300010;WOS:A1994NT83300010;000211422000006.27;000211422000006.27;000211422000006.27;000211422000006.27;000211422000006.27;WOS:000168933000014;WOS:000168933000014;WOS:000168933000014;WOS:000168933000014;WOS:000168933000014,true,cited
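
The extra-dataset record above appears to be unquoted, with commas inside a field escaped as `\,`. If the file itself is laid out that way (an assumption based only on this pasted record), Python's standard `csv` module can read it by disabling quoting and declaring the backslash as the escape character. The function name is hypothetical.

```python
import csv

def read_escaped_rows(text_stream):
    """Yield rows from an unquoted CSV whose literal commas are written as '\\,'."""
    # Long keywords/refs fields may exceed the default field-size limit.
    csv.field_size_limit(2**31 - 1)
    # QUOTE_NONE: double quotes are ordinary characters.
    # escapechar: a backslash removes the delimiter meaning of the next comma.
    return csv.reader(text_stream, quoting=csv.QUOTE_NONE, escapechar="\\")
```

With these settings, a line beginning `Ghodrati\, Behzad|Kumar\, Uday,...` yields `Ghodrati, Behzad|Kumar, Uday` as its first field instead of being split at the escaped commas.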

Hope this helps find the problem.

Thanks!

Yangyang

XiaoranYan commented 5 years ago

Hi Yangyang,

Sorry for the slow response. Thanks to your information, we were able to identify the problem and the updated data set can be downloaded at (link valid for a week) https://iunimag.blob.core.windows.net/mag-2019-01-25/YangyangPapersUpdated.csv.gz?st=2019-04-27T23%3A34%3A33Z&se=2019-05-04T23%3A34%3A00Z&sp=rl&sv=2017-07-29&sr=b&sig=d%2BIXoIQL5AHQUQyc8FeaoGX6ruKaOkAnmoCQWbwLzLU%3D

https://iunimag.blob.core.windows.net/mag-2019-01-25/YangyangExtraUpdated.csv.gz?st=2019-04-27T23%3A34%3A56Z&se=2019-05-04T23%3A34%3A00Z&sp=rl&sv=2017-07-29&sr=b&sig=IHxV47aCHgK3NGaNj9w8apYaY9BNunvcv8o17r3cEDs%3D

The new data fixes the duplicates in abstract and keywords, and corrects the order in which authors and their affiliations appear in the nested structure. Previously, the order was shuffled and did not reflect their order of appearance in the papers.
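
With the order corrected, authors can be paired with their affiliations by position. This is a minimal sketch assuming the `|`-separated layout seen in the sample records earlier in this thread; the function name is hypothetical, and positional pairing is only meaningful for records flagged as address-verified.

```python
def pair_authors(authors, affiliations, countries):
    """Zip '|'-separated author, affiliation and country fields by position."""
    names = authors.split("|")
    addresses = affiliations.split("|")
    lands = countries.split("|")
    if not (len(names) == len(addresses) == len(lands)):
        # Unequal lengths mean the positions cannot be trusted.
        raise ValueError("fields differ in length; positional pairing is unsafe")
    return list(zip(names, addresses, lands))
```

Raising on a length mismatch, rather than silently truncating as a bare `zip` would, makes records that still have inconsistent fields easy to spot.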

As for reference records with decimal suffixes like "WOS:000268350400009.12", these are references that were not found in the whole WoS collection. You will not be able to find a match for them even in the extended data set.
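
Based on that rule, a refs field can be split into references that resolve to a WoS record and those that do not. The sketch below assumes IDs are joined by `;` and that an unmatched reference is any ID ending in a dot followed by digits, with or without the `WOS:` prefix (both forms appear in the samples in this thread). The pattern and the function name are illustrative, not an official WoS format definition.

```python
import re

# Assumed shape of an unmatched reference: "<paper id>.<position>".
UNMATCHED = re.compile(r"(?:WOS:)?\w+\.\d+")

def split_refs(refs_field):
    """Return (matched_ids, unmatched_ids) from a ';'-separated refs field."""
    matched, unmatched = [], []
    if refs_field is None or refs_field == "NULL":
        return matched, unmatched
    for ref in refs_field.split(";"):
        ref = ref.strip()
        if not ref:
            continue
        (unmatched if UNMATCHED.fullmatch(ref) else matched).append(ref)
    return matched, unmatched
```

Only the matched IDs are worth looking up in the extended data set; the unmatched ones can still be counted, for example to estimate how much of a reference list falls outside WoS.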