oracle / opengrok

OpenGrok is a fast and usable source code search and cross reference engine, written in Java
http://oracle.github.io/opengrok/

Support retrieving function definitions for a project path #4508

Closed ronvgs closed 6 months ago

ronvgs commented 7 months ago

I'm working on a project, and one of the tasks requires getting all function definitions from all C++ source files under a project path.

Describe the solution you'd like: I would like the OpenGrok API to take a project path as input and return information (return type, name, parameter list, enclosing type name, namespace, etc.) about each function definition found in each source file under that path.

Describe alternatives you've considered: None.

Additional context: N/A

vladak commented 7 months ago

The definitions for a given file are stored in the index document and can be retrieved via IndexDocument#getDefinitions(File file). The returned Definitions object contains data such as a map of symbol names to line numbers, a map of line numbers to tags, and a set of tags. A Tag carries the line number, symbol name, type, full line string, namespace, signature, and start/end offsets within the line. The question is which of this information to expose in the API results, and how to represent it. A good practice is to introduce a DTO (Data Transfer Object) for this purpose. The DTO can contain a subset of the data from the Definitions object; it is just a question of which data and in what form.
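To make the DTO idea concrete, here is a minimal sketch of what such a class could look like. Everything in it is an assumption for illustration: SourceTag is a stand-in for the indexer's tag type (its field names follow the description above, not the OpenGrok sources), and TagDto is a hypothetical name, not an existing class.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the indexer's tag type. The fields mirror the data listed
// above; this is NOT the actual OpenGrok class.
final class SourceTag {
    final int line;
    final String symbol;
    final String type;
    final String text;
    final String namespace;
    final String signature;
    final int lineStart;
    final int lineEnd;

    SourceTag(int line, String symbol, String type, String text,
              String namespace, String signature, int lineStart, int lineEnd) {
        this.line = line;
        this.symbol = symbol;
        this.type = type;
        this.text = text;
        this.namespace = namespace;
        this.signature = signature;
        this.lineStart = lineStart;
        this.lineEnd = lineEnd;
    }
}

// Hypothetical DTO that exposes only a subset of the tag data.
final class TagDto {
    final String symbol;
    final String type;
    final String signature;
    final String namespace;
    final int line;

    private TagDto(String symbol, String type, String signature,
                   String namespace, int line) {
        this.symbol = symbol;
        this.type = type;
        this.signature = signature;
        this.namespace = namespace;
        this.line = line;
    }

    // Copies the chosen subset of fields; the full line text and the
    // in-line offsets are deliberately left out in this sketch.
    static TagDto from(SourceTag tag) {
        return new TagDto(tag.symbol, tag.type, tag.signature,
                tag.namespace, tag.line);
    }

    // Converts a whole list of tags, preserving order.
    static List<TagDto> fromAll(List<SourceTag> tags) {
        List<TagDto> result = new ArrayList<>(tags.size());
        for (SourceTag tag : tags) {
            result.add(from(tag));
        }
        return result;
    }
}
```

Keeping the DTO separate from the internal type means the API shape can stay stable even if the indexer's representation changes.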

The API calls would obviously need to go through authorization checks.

vladak commented 7 months ago

Also, the above suggests the API endpoint would be limited to a single file, i.e. it would be similar to the pre-existing genre API: api/v1/file/defs?path=
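A client-side sketch of how a request to such an endpoint might be built, using the JDK's java.net.http API. The endpoint path is the one proposed above; the class name, method names, and the base URL in the usage note are assumptions for illustration, and any authorization headers a real deployment needs are omitted.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of OpenGrok.
final class DefsRequest {
    // Builds the URI for the proposed per-file definitions endpoint.
    // The file path is percent-encoded so it is safe as a query value.
    static URI buildDefsUri(String baseUrl, String path) {
        String encoded = URLEncoder.encode(path, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/v1/file/defs?path=" + encoded);
    }

    // Builds a GET request that asks for a JSON response.
    static HttpRequest buildRequest(String baseUrl, String path) {
        return HttpRequest.newBuilder(buildDefsUri(baseUrl, path))
                .header("Accept", "application/json")
                .GET()
                .build();
    }
}
```

For example, `DefsRequest.buildRequest("http://localhost:8080/source", "/myproject/src/main.c")` would produce a request that can be passed to `HttpClient.send`; both the base URL and the file path here are placeholders.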

vladak commented 7 months ago

Also, pagination should be considered as the "tags" data for some files can be significant. On the other hand, I don't want to complicate this API endpoint too much.

vladak commented 7 months ago

Speaking of simplicity, it would seem to me that the API should merely present the list of tags, i.e. the contents of Definitions#tags list. The items of the list, Tag objects, contain various interesting pieces of information: https://github.com/oracle/opengrok/blob/6fb5eeefacb6d6322a342cd458d2dca9f6273893/opengrok-indexer/src/main/java/org/opengrok/indexer/analysis/Definitions.java#L209-L242

If one needs the line to tag mapping, that could be reconstructed from the list.
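As a sketch of that reconstruction, a client could group the returned entries by their line number. DefEntry and LineToTags are hypothetical names, and the entry holds only the fields needed for the grouping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical client-side view of one entry in the returned list.
final class DefEntry {
    final String symbol;
    final String type;
    final int line;

    DefEntry(String symbol, String type, int line) {
        this.symbol = symbol;
        this.type = type;
        this.line = line;
    }
}

final class LineToTags {
    // Rebuilds the line-to-tags mapping from a flat list of entries.
    // A TreeMap keeps the lines in ascending order; entries that share
    // a line stay in their original list order.
    static Map<Integer, List<DefEntry>> byLine(List<DefEntry> entries) {
        Map<Integer, List<DefEntry>> result = new TreeMap<>();
        for (DefEntry entry : entries) {
            result.computeIfAbsent(entry.line, k -> new ArrayList<>())
                    .add(entry);
        }
        return result;
    }
}
```

The symbol-to-line mapping could be rebuilt the same way by keying on the symbol instead of the line.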

Would that match your use case, @ronvgs?

vladak commented 7 months ago

Here's a sample output for https://github.com/openssl/openssl/blob/master/crypto/aes/aes_cbc.c (the query URL would end with /api/v1/file/defs?path=/openssl-master/crypto/aes/aes_cbc.c):

[
  {
    "type": "function",
    "signature": "(const unsigned char * in,unsigned char * out,size_t len,const AES_KEY * key,unsigned char * ivec,const int enc)",
    "text": "void AES_cbc_encrypt(const unsigned char *in, unsigned char *out,",
    "symbol": "AES_cbc_encrypt",
    "lineStart": 5,
    "lineEnd": 20,
    "line": 20,
    "namespace": null
  },
  {
    "type": "argument",
    "signature": "(const unsigned char * in,unsigned char * out,size_t len,const AES_KEY * key,unsigned char * ivec,const int enc)",
    "text": "AES_cbc_encrypt(const unsigned char * in,unsigned char * out,size_t len,const AES_KEY * key,unsigned char * ivec,const int enc)",
    "symbol": "in",
    "lineStart": 21,
    "lineEnd": 44,
    "line": 20,
    "namespace": null
  },
  {
    "type": "argument",
    "signature": "(const unsigned char * in,unsigned char * out,size_t len,const AES_KEY * key,unsigned char * ivec,const int enc)",
    "text": "AES_cbc_encrypt(const unsigned char * in,unsigned char * out,size_t len,const AES_KEY * key,unsigned char * ivec,const int enc)",
    "symbol": "out",
    "lineStart": 46,
    "lineEnd": 64,
    "line": 20,
    "namespace": null
  },
  {
    "type": "argument",
    "signature": "(const unsigned char * in,unsigned char * out,size_t len,const AES_KEY * key,unsigned char * ivec,const int enc)",
    "text": "AES_cbc_encrypt(const unsigned char * in,unsigned char * out,size_t len,const AES_KEY * key,unsigned char * ivec,const int enc)",
    "symbol": "len",
    "lineStart": 21,
    "lineEnd": 31,
    "line": 21,
    "namespace": null
  },
  {
    "type": "argument",
    "signature": "(const unsigned char * in,unsigned char * out,size_t len,const AES_KEY * key,unsigned char * ivec,const int enc)",
    "text": "AES_cbc_encrypt(const unsigned char * in,unsigned char * out,size_t len,const AES_KEY * key,unsigned char * ivec,const int enc)",
    "symbol": "key",
    "lineStart": 33,
    "lineEnd": 51,
    "line": 21,
    "namespace": null
  },
  {
    "type": "argument",
    "signature": "(const unsigned char * in,unsigned char * out,size_t len,const AES_KEY * key,unsigned char * ivec,const int enc)",
    "text": "AES_cbc_encrypt(const unsigned char * in,unsigned char * out,size_t len,const AES_KEY * key,unsigned char * ivec,const int enc)",
    "symbol": "ivec",
    "lineStart": 21,
    "lineEnd": 40,
    "line": 22,
    "namespace": null
  },
  {
    "type": "argument",
    "signature": "(const unsigned char * in,unsigned char * out,size_t len,const AES_KEY * key,unsigned char * ivec,const int enc)",
    "text": "AES_cbc_encrypt(const unsigned char * in,unsigned char * out,size_t len,const AES_KEY * key,unsigned char * ivec,const int enc)",
    "symbol": "enc",
    "lineStart": 42,
    "lineEnd": 55,
    "line": 22,
    "namespace": null
  }
]