Closed paulalbert1 closed 7 years ago
The logic should be "[letter][letter][5 or 6 digit number]", and then if anything that is not a number appears, such as a space, dash, or parenthesis, that's the end of the string. Don't worry about the edge cases. These data can be messy.
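A minimal sketch of that rule as a Java regular expression, in case it helps. The class and method names are hypothetical and not part of ReCiter. It also assumes a single optional space between the letters and the digits (to cover strings such as "CA 140409"), and it skips digit runs longer than six rather than truncating them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: the class and method names are illustrative only.
class GrantIdExtractor {

    // Two letters, an optional whitespace character (an assumption, so that
    // "CA 140409" is accepted), then 5 or 6 digits. The trailing negative
    // lookahead requires the digit run to end there, so a 7-digit run is skipped.
    private static final Pattern GRANT_ID =
            Pattern.compile("[A-Za-z]{2}\\s?(\\d{5,6})(?!\\d)");

    // Returns the digit portion of every match, in order of appearance.
    static List<String> extractDigits(String fundingStatement) {
        List<String> ids = new ArrayList<>();
        if (fundingStatement == null) {
            return ids;
        }
        Matcher m = GRANT_ID.matcher(fundingStatement);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        // prints [007399]
        System.out.println(extractDigits("1 DP2 OD007399-01/OD/NIH HHS/United States"));
        // prints [140409]
        System.out.println(extractDigits("CA 140409/CA/NCI NIH HHS/United States"));
    }
}
```

Only the digits are returned here because the cross-reference compares the numeric code and ignores the surrounding characters.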
Please provide more information on this. Do we need to parse this information before inserting it into the DB table, or do we need to extract it while parsing the XML files from the PubMed database?
Hi Jie/Paul, what needs to be done next after the cross-reference check? We can get the grant/fund IDs from PubMed and verify them against the DB table. After extracting and matching the grant IDs, is there any further step? Or do we need to update the existing sponsorAwardId values with these IDs? As it stands, whatever we extract is only temporary, because we are not doing anything with it afterwards. Please advise.
Hi Jin,
Where should this code be implemented? Please provide the likely location and some pseudocode.
Hi Balu,
If there is a match, fundingStatementScore should be assigned a value of 1. This score is referenced in the following locations in the code:
/src/main/java/reciter/utils/writer/AnalysisCSVWriter.java
/src/main/java/reciter/erroranalysis/AnalysisObject.java
/src/main/java/reciter/erroranalysis/AnalysisTranslator.java
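As rough pseudocode for the scoring step, something like the following could work. The class name, the method signature, the `double` return type, and the default of 0 when nothing matches are all assumptions; only the value of 1 on a match comes from the description above.

```java
import java.util.Set;

// Hypothetical sketch: not the actual ReCiter class or field type.
class FundingStatementScorer {

    // Returns 1 if any grant ID found in the article's funding statement
    // also appears among the person's known grant IDs.
    // Returning 0 otherwise is an assumption.
    static double fundingStatementScore(Set<String> articleGrantIds,
                                        Set<String> identityGrantIds) {
        for (String id : articleGrantIds) {
            if (identityGrantIds.contains(id)) {
                return 1.0;
            }
        }
        return 0.0;
    }
}
```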
Is this helpful? Please let us know if you need any additional details.
Yes, Thanks Michael... It helps to complete the code level implementation.
Hi Jie Lin / Michael,
If you look under grant support for the PubMed record, you see these declarations:
• 1 DP2 OD007399-01/OD/NIH HHS/United States
• CA 140409/CA/NCI NIH HHS/United States
As stated above in the first statement about getting the PubMed record, do we need to extract this information when reading the XML file in AbstractXMLFetcher.java, or is there another way to get these strings from an existing class? Please advise.
Hi Jie Lin / Michael / Paul,
I think I can use the PubMedXMLFetcher.java class to get the funding statement string for the above PubMed record using the CWID. I hope I am heading in the right direction based on my analysis; please advise if not.
Jie reviewed the code and it looks good.
Hanumantha has integrated the code into ReCiterClusterer; he will update the code so that it will write a score to the CSV file or to the database.
Issue closed. We are currently using the Grant Ids from Oracle (since that is the same as the sponsor award) and I remember you mentioned that since we get the ids from Oracle, no parsing is now required.
Monica L. Guzman (mlg2007) wrote this paper: 25557492. A good piece of evidence is her (or her co-authors') statement in the PubMed record of where funding came from.
If you look under grant support for the PubMed record, you see these declarations:
Take a five or six-digit code and parse them like so:
Now, go see if these IDs are in rc_identity_grant.sponsorAwardId, cross-referencing against the CWID mlg2007.
The last ID is listed in the sponsorAwardId field as 5 R21 CA158728-02, which matches against 158275. Ignore the characters before and after the six-digit codes. In the above example, ignore "5 R21 CA" and "-02".
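The cross-reference could be sketched roughly as below. This is illustrative only: the class and method names are hypothetical, and an in-memory map from CWID to sponsorAwardId values stands in for a query against rc_identity_grant. Both sides are reduced to their standalone five or six-digit codes before comparison, so a sponsorAwardId of "5 R21 CA158728-02" reduces to "158728".

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: names are illustrative, and the Map stands in for
// the rc_identity_grant table (CWID -> sponsorAwardId values).
class GrantCrossReference {

    // A run of exactly 5 or 6 digits with no digit directly before or after it.
    private static final Pattern CODE = Pattern.compile("(?<!\\d)\\d{5,6}(?!\\d)");

    // Collects every standalone 5- or 6-digit code in the text.
    static Set<String> digitCodes(String text) {
        Set<String> codes = new HashSet<>();
        Matcher m = CODE.matcher(text);
        while (m.find()) {
            codes.add(m.group());
        }
        return codes;
    }

    // Returns the codes that appear both in the PubMed grant strings and in
    // the sponsorAwardId values recorded for the given CWID.
    static Set<String> matchingCodes(String cwid,
                                     Map<String, List<String>> identityGrants,
                                     List<String> pubmedGrants) {
        Set<String> known = new HashSet<>();
        for (String awardId : identityGrants.getOrDefault(cwid, Collections.<String>emptyList())) {
            known.addAll(digitCodes(awardId));
        }
        Set<String> matches = new TreeSet<>();
        for (String grant : pubmedGrants) {
            for (String code : digitCodes(grant)) {
                if (known.contains(code)) {
                    matches.add(code);
                }
            }
        }
        return matches;
    }
}
```

A non-empty result from `matchingCodes` would count as a match for that CWID.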
Related to #49