funginstitute / patentprocessor


USApplicationCitation Trend looks strange (bug?) #58

Closed laironald closed 10 years ago

laironald commented 11 years ago

Stats for USApplicationCitation look unusual.

Is this OK or is there something wrong with the Parser?

year    file                    records     %incr   %yr/rec  %per-pat
2005    usapplicationcitation   79,024              0.87%    0.50
2006    usapplicationcitation   275,299     248.37% 2.43%    1.43
2007    usapplicationcitation   396,997     44.21%  3.62%    2.17
2008    usapplicationcitation   539,260     35.83%  4.80%    2.91
2009    usapplicationcitation   796,314     47.67%  6.38%    4.15
2010    usapplicationcitation   1,333,006   67.40%  7.77%    5.45
2011    usapplicationcitation   1,722,692   29.23%  9.34%    6.94
2012    usapplicationcitation   2,292,871   33.10%  11.09%   8.27
2013    usapplicationcitation   1,341,838   17.04%  11.89%   9.30
total   usapplicationcitation   8,777,301           7.16%    4.81

%incr = increase from previous year
%yr/rec = the breakdown vs the total number of records for the year
%per-pat = number of items per patent
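
For anyone who wants to double-check the %incr column, here is a minimal sketch that recomputes it from the record counts in the table. The function name `pct_increase` is just illustrative and is not part of patentprocessor. The 2013 row is left out: its listed 17.04% is not what this simple ratio gives for 1,341,838 records, presumably because 2013 was a partial year when the stats were run (that is an assumption; the thread does not say how that figure was derived).

```python
# Record counts copied from the table above (full years only).
records = {
    2005: 79_024,
    2006: 275_299,
    2007: 396_997,
    2008: 539_260,
    2009: 796_314,
    2010: 1_333_006,
    2011: 1_722_692,
    2012: 2_292_871,
}


def pct_increase(counts):
    """Return {year: percent increase over the previous year in `counts`}."""
    years = sorted(counts)
    return {
        year: round(100.0 * (counts[year] - counts[prev]) / counts[prev], 2)
        for prev, year in zip(years, years[1:])
    }


incr = pct_increase(records)
for year, pct in incr.items():
    print(f"{year}  {pct:.2f}%")
```

The first year (2005) has no previous year to compare against, so it gets no entry, which matches the blank cell in the table.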

In this case, the %per-pat and %yr/rec values look unusual, especially when compared with the other tables. It might be OK, but I can't tell from the data.

Google Spreadsheet

gtfierro commented 11 years ago

I took a quick look at some of the data from a couple of weeks in 2013 and so far it looks okay. I think citations have really shot up in recent years along with the influx of patents. I'll ask Lee if the numbers make sense next time I see him.

laironald commented 11 years ago

I don't think 2013 is the issue. 2005 is the year that looks weird to me.

gtfierro commented 11 years ago

Oops, should've taken more than a cursory glance at the table, sorry.

I'll look at the 2005 files tomorrow -- this is very likely an XML schema error.

laironald commented 11 years ago

no problem!

gtfierro commented 11 years ago

Now that I've looked at the XML files, everything seems solid, so I can't think of why the numbers are so much lower. I guess there were just drastically fewer citations of applications back in 2005.

laironald commented 11 years ago

hrm interesting.

I have a theory. Patent applications only started being publicly released around 2004 or something like that? I think it has something to do with that.

Meaning: now that patent applications are publicly available, patents have started to cite them. Before, since applications weren't public, it was harder to know about them, so they were cited less often.

Can you check the years of the cited applications and run this logic by Lee?
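
One rough way to check the years of the cited applications is to bucket the cited document numbers by year. This sketch relies on the fact that US pre-grant publication numbers begin with the four-digit publication year followed by a seven-digit serial (for example `2003/0123456`). How usapplicationcitation actually stores these numbers is an assumption here, and the sample values and the name `cited_publication_years` are made up for illustration.

```python
import re
from collections import Counter

# Matches e.g. "20020012345", "2004/0000001", or "US 2003/0123456".
# Group 1 is the four-digit publication year.
PUB_NUMBER = re.compile(r"^(?:US)?\s*(20\d{2})[/\s]?\d{7}")


def cited_publication_years(doc_numbers):
    """Count cited application publications per publication year.

    Strings that do not look like a pre-grant publication number are skipped.
    """
    counts = Counter()
    for raw in doc_numbers:
        match = PUB_NUMBER.match(raw.strip())
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical sample values, not taken from the real data.
sample = [
    "20020012345",
    "US 2003/0123456",
    "2004/0000001",
    "20040055555",
    "not-a-number",
]
years = cited_publication_years(sample)
for year in sorted(years):
    print(year, years[year])
```

If the theory holds, the real data should show essentially no cited applications dated before applications started being published, and a growing pool of citable applications after that.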

gtfierro commented 11 years ago

Application data on the Google mirror (http://www.google.com/googlebooks/uspto-patents-applications-text.html) only goes back to 2001, so that seems likely.

laironald commented 11 years ago

kk. we should run it by Lee.

gtfierro commented 11 years ago

Lee says to compare with USPTO. I'm super busy, so I asked Kevin S to look into it.

gtfierro commented 10 years ago

This has been addressed by the application parsing.