ChristianGerloff / findpapers

Findpapers is an application that helps researchers who are looking for references for their work
MIT License

integrate references to the findpapers response #11

Open kashyapm94 opened 2 years ago

kashyapm94 commented 2 years ago

The current output of the findpapers.Paper contains the DOI of each paper that has been extracted. In order to integrate the references to the findpapers response, the following steps have to be done:

ChristianGerloff commented 2 years ago

Thx @kashyapm94 . Great 😃 Perhaps a few thoughts. If it makes sense, we can separate the task into multiple issues and divide the workload.

Please note: the docstring format below was copied from existing code. We should stick to our preferred Google-style docstrings for new classes if you like.

suggestions

references tool

Wrong type of search

search is not of type dict; it's a specific class defined in models. The reference tool can be located in the tools module, but in this case it should perhaps be more generic, and I would suggest the following changes:

get_references_doi:

def get_references_doi(doi: str) -> List[str]:
    """
    Get references from the opencitations API based on the doi.

    Returns:
        List[str]: dois of references cited by input doi
    """
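A minimal sketch of how this helper could look, using only the standard library. The references endpoint URL is an assumption modelled on the metadata URL mentioned further down, and so is the shape of the response (a JSON list of records with a `cited` field, sometimes prefixed with an index name). The `fetch` parameter is a hypothetical hook so the parsing can be tried without network access.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, List

# Assumed endpoint, modelled on the metadata URL mentioned in this thread.
REFERENCES_URL = "https://opencitations.net/index/api/v1/references/"


def _fetch_json(url: str) -> list:
    """Download and decode a JSON document (needs network access)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def get_references_doi(
    doi: str, fetch: Callable[[str], list] = _fetch_json
) -> List[str]:
    """Get references from the opencitations API based on the doi.

    Args:
        doi (str): doi of the citing paper
        fetch (Callable): hypothetical hook that maps a URL to decoded JSON

    Returns:
        List[str]: dois of references cited by input doi
    """
    records = fetch(REFERENCES_URL + urllib.parse.quote(doi, safe="/"))
    dois: List[str] = []
    for record in records:
        cited = record.get("cited", "")
        # Assumption: values may carry an index prefix such as "coci => 10.1/x".
        cited = cited.split("=>")[-1].strip().lower()
        if cited and cited not in dois:
            dois.append(cited)
    return dois
```

Lower-casing and de-duplicating the dois here keeps later comparisons against already collected papers simple.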

select

Select means that a paper is considered for the next review step. Select should only be considered in the cross_reference method (see below).

opencitations_searcher - new class in searchers module

To avoid complexity, we can directly use opencitations to add new papers from reference dois via https://opencitations.net/index/api/v1/metadata/

possible structure

def _get_paper_entry(doi: str) -> dict:
    """
    This method returns paper metadata from opencitations database using the doi

    Parameters
    ----------
    doi : str
        A doi

    Returns
    -------
    dict
        a paper entry from opencitations
    """
    # ..... some request stuff
def _get_publication(paper_entry: dict) -> Publication:
    """
    Creates publication instance from paper entry

    Parameters
    ----------
    paper_entry : dict
        A paper entry retrieved from opencitations API

    Returns
    -------
    Publication
        A publication instance
    """
def _get_paper(paper_entry: dict, publication: Publication) -> Paper:
    """
    Creates paper instance from paper entry

    Parameters
    ----------
    paper_entry : dict
        A paper entry retrieved from opencitations API
    publication: Publication
        A publication instance that will be associated with the paper

    Returns
    -------
    Paper
        A paper instance or None
    """
    # .... you can also directly add the references from the opencitations response (so the refs of cross-refs ;))
    # .... attention: this should set paper.cross_reference = True
def run(search: Search):
    """
    This method fetches papers from opencitations database using the provided search parameters.
    The collected papers are added to the search instance

    Parameters
    ----------
    search : Search
        A search instance
    """

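To make the structure above a bit more concrete, here is a rough sketch of the entry-to-paper step. It does not use the real findpapers models: `PaperStub` is a hypothetical stand-in, since the actual `Paper` and `Publication` constructors may look different. The field names (`title`, `doi`, `year`, `author`, `source_title`, `reference`) and the `;`-separated lists are assumptions about the opencitations metadata response and should be checked against a real reply.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PaperStub:
    """Hypothetical stand-in for findpapers' Paper model."""
    title: str
    doi: str
    year: Optional[int] = None
    authors: List[str] = field(default_factory=list)
    publication_title: Optional[str] = None
    references: List[str] = field(default_factory=list)
    cross_reference: bool = False


def _split(value: Optional[str]) -> List[str]:
    """Split a ';'-separated field into its non-empty, stripped parts."""
    return [part.strip() for part in (value or "").split(";") if part.strip()]


def paper_from_entry(paper_entry: dict) -> Optional[PaperStub]:
    """Build a paper from one metadata entry, or None if title/doi is missing."""
    title = (paper_entry.get("title") or "").strip()
    doi = (paper_entry.get("doi") or "").strip().lower()
    if not title or not doi:
        return None
    # Assumption: the year field starts with a four-digit year, e.g. "2015-03".
    year_text = (paper_entry.get("year") or "")[:4]
    return PaperStub(
        title=title,
        doi=doi,
        year=int(year_text) if year_text.isdigit() else None,
        authors=_split(paper_entry.get("author")),
        publication_title=paper_entry.get("source_title") or None,
        # The refs of the cross-referenced paper are stored directly.
        references=[ref.lower() for ref in _split(paper_entry.get("reference"))],
        # Papers created by this searcher are always cross-references.
        cross_reference=True,
    )
```

Returning None for incomplete entries mirrors the "A paper instance or None" contract in the `_get_paper` docstring.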
missing class properties of paper

Both properties should be added to the Paper class itself, because every paper object is an instance of that class. To avoid further adjustments or breaking changes, you could use getters and setters.
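A small sketch of the getter/setter idea. The class below is a reduced stand-in, not the real findpapers `Paper`, and the attribute names `references` and `cross_reference` are assumptions taken from how they are used elsewhere in this thread.

```python
from typing import Iterable, List, Optional


class Paper:
    """Reduced stand-in for findpapers' Paper, showing only the new attributes."""

    def __init__(self, title: str, doi: Optional[str] = None):
        self.title = title
        self.doi = doi
        # Defaults keep existing code that never sets these attributes working.
        self._references: List[str] = []
        self._cross_reference: bool = False

    @property
    def references(self) -> List[str]:
        """dois of the papers cited by this paper (returned as a copy)."""
        return list(self._references)

    @references.setter
    def references(self, dois: Optional[Iterable[str]]) -> None:
        cleaned: List[str] = []
        for doi in dois or []:
            doi = doi.strip().lower()
            if doi and doi not in cleaned:
                cleaned.append(doi)
        self._references = cleaned

    @property
    def cross_reference(self) -> bool:
        """True if the paper was added via the references of another paper."""
        return self._cross_reference

    @cross_reference.setter
    def cross_reference(self, value: bool) -> None:
        self._cross_reference = bool(value)
```

With properties, callers keep plain attribute access (`paper.references = [...]`), and normalisation can be changed later without touching them.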

cross-reference a method of search class

It is a function that can be called from search_runner and beyond.

def cross_reference(search: Search) -> Search:
    """
    This function extends current search results by adding new papers from the references property of each original paper in the search object using opencitations.

    Parameters
    ----------
    search : Search
        A search instance
    Returns
    -------
    Search
        A search instance or None
    """
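One possible shape for the body, as a sketch only. `PaperStub` and `SearchStub` are hypothetical stand-ins for the real models, and `fetch_paper` is an assumed callable (for example a thin wrapper around the opencitations searcher) that returns a paper for a doi or None. The sketch follows references one level deep and skips dois that are already part of the search.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class PaperStub:
    """Hypothetical stand-in for findpapers' Paper model."""
    doi: Optional[str] = None
    references: List[str] = field(default_factory=list)
    cross_reference: bool = False


@dataclass
class SearchStub:
    """Hypothetical stand-in for findpapers' Search model."""
    papers: List[PaperStub] = field(default_factory=list)


def cross_reference(
    search: SearchStub,
    fetch_paper: Callable[[str], Optional[PaperStub]],
) -> SearchStub:
    """Extend the search with papers referenced by the original papers."""
    known = {paper.doi for paper in search.papers if paper.doi}
    # Iterate over a snapshot so newly added papers are not expanded again.
    for paper in list(search.papers):
        for doi in paper.references:
            if doi in known:
                continue
            known.add(doi)
            new_paper = fetch_paper(doi)
            if new_paper is None:
                continue
            new_paper.cross_reference = True
            search.papers.append(new_paper)
    return search
```

Iterating over a snapshot of `search.papers` is a deliberate choice here: it limits the expansion to one hop, so the result set cannot grow without bound.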