Closed aminvielledebatAtBedrock closed 1 month ago
The following patch works, but some file information is missing:
--- a/github/table_github_repository_content.go
+++ b/github/table_github_repository_content.go
@@ -106,6 +106,13 @@ func getFileContents(ctx context.Context, d *plugin.QueryData, h *plugin.Hydrate
}
}
} `graphql:"... on Tree"`
+ Blob struct {
+ Oid githubv4.String
+ AbbreviatedOid githubv4.String
+ Text githubv4.String
+ IsBinary githubv4.Boolean
+ CommitUrl githubv4.String
+ } `graphql:"... on Blob"`
} `graphql:"object(expression: $expression)"`
} `graphql:"repository(owner: $owner, name: $repo)"`
}
@@ -127,6 +134,18 @@ func getFileContents(ctx context.Context, d *plugin.QueryData, h *plugin.Hydrate
return err
}
+ if query.Repository.Object.Blob.Oid != "" {
+ data := query.Repository.Object.Blob
+ c := ContentInfo{
+ Oid: string(data.Oid),
+ AbbreviatedOid: string(data.AbbreviatedOid),
+ Content: string(data.Text),
+ IsBinary: bool(data.IsBinary),
+ CommitUrl: string(data.CommitUrl),
+ }
+ d.StreamListItem(ctx, c)
+ }
+
for _, data := range query.Repository.Object.Tree.Entries {
if string(data.Type) != "tree" {
c := ContentInfo{
The problem seems to come from the GraphQL query.
This query returns nothing:
query($expression:String!$owner:String!$repo:String!){
rateLimit{
remaining,used,cost,limit,resetAt,nodeCount
},
repository(owner: $owner, name: $repo){
object(expression: $expression){
... on Tree{
oid,abbreviatedOid,entries{
name,path,size,lineCount,mode,pathRaw,isGenerated,type,object{
... on Blob{
oid,abbreviatedOid,text,isBinary,commitUrl}}}
}
}
}
}
This query returns my file content:
query($expression:String!$owner:String!$repo:String!){
rateLimit{
remaining,used,cost,limit,resetAt,nodeCount
},
repository(owner: $owner, name: $repo){
object(expression: $expression){
... on Tree{
oid,abbreviatedOid,entries{
name,path,size,lineCount,mode,pathRaw,isGenerated,type,object{
... on Blob{
oid,abbreviatedOid,text,isBinary,commitUrl}}}
}
... on Blob {
oid,abbreviatedOid,text,isBinary,commitUrl
}
}
}
}
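The difference between the two queries can be illustrated offline. This is a minimal sketch, assuming standard GraphQL fragment semantics: when the expression resolves to a blob and only `... on Tree` is selected, no fragment matches and the `object` comes back with no fields. The JSON payloads and the names `gitObject`, `response`, and `classify` are hypothetical and are not taken from the plugin.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// gitObject mirrors the fields selected by the two inline fragments.
// The JSON shape is an assumption based on standard GraphQL responses.
type gitObject struct {
	Oid     string            `json:"oid"`
	Text    *string           `json:"text"`    // selected only by "... on Blob"
	Entries []json.RawMessage `json:"entries"` // selected only by "... on Tree"
}

type response struct {
	Data struct {
		Repository struct {
			Object *gitObject `json:"object"`
		} `json:"repository"`
	} `json:"data"`
}

// classify reports which kind of result a response body carries.
func classify(body string) (string, error) {
	var r response
	if err := json.Unmarshal([]byte(body), &r); err != nil {
		return "", err
	}
	obj := r.Data.Repository.Object
	switch {
	case obj == nil:
		return "not found", nil
	case obj.Entries != nil:
		return "tree", nil
	case obj.Oid != "":
		return "blob", nil
	default:
		// The object exists, but no fragment matched its type.
		return "empty", nil
	}
}

func main() {
	// Made-up payloads for a file path queried with and without "... on Blob".
	treeOnly := `{"data":{"repository":{"object":{}}}}`
	withBlob := `{"data":{"repository":{"object":{"oid":"abc123","text":"hello"}}}}`

	for _, body := range []string{treeOnly, withBlob} {
		kind, err := classify(body)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(kind)
	}
}
```

The `obj.Oid != ""` check is the same test the patch above uses to decide whether a blob was returned.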
Maybe it's related to my GHE instance (version 3.12.4).
cc @ParthaI and @graza-io
Thanks @aminvielledebatAtBedrock, for the detailed information. I will take a look at it.
Hello @aminvielledebatAtBedrock,
I was looking into the issue you mentioned, and it seems the behavior is intended. The table provides the file content under a specified folder path. If no folder path is specified, it fetches all the content under the repository.
I agree that the suggestion you provided is working fine.
If you are looking for the content of a particular file, could you please try the following:
The github_repository_content table with different query parameters, for example: select * from github_repository_content where repository_full_name = 'turbot/steampipe-plugin-github' and name = 'Makefile'
The github_blob and github_tree tables, which may help you get the content of a specific file.
Could you please give these suggestions a try and let us know if they help?
Thanks!
Indeed it works, but your GraphQL query looks like this:
query($expression:String!$owner:String!$repo:String!){
rateLimit{
remaining,used,cost,limit,resetAt,nodeCount
},
repository(owner: $owner, name: $repo){
object(expression: $expression){
... on Tree{
oid,abbreviatedOid,entries{
name,path,size,lineCount,mode,pathRaw,isGenerated,type,object{
... on Blob{
oid,abbreviatedOid,text,isBinary,commitUrl}}}
}
}
}
}
with expression equal to HEAD:. That means you fetch all the files at the root of a repository. If you want only one file across hundreds of repositories, you lose a lot of time.
For the other tables, it means you have to find the SHA before fetching your HEAD, and actually, I'm not able to decode my file content with the github_blob table :cry:
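For reference, the `expression` variable follows Git's `<rev>:<path>` form, so `HEAD:` addresses the root tree while a path after the colon addresses a single object. A minimal sketch, with the helper name `buildExpression` being hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// buildExpression returns a Git "<rev>:<path>" expression.
// An empty path addresses the root tree of the revision.
func buildExpression(rev, filePath string) string {
	return rev + ":" + strings.Trim(filePath, "/")
}

func main() {
	fmt.Println(buildExpression("HEAD", ""))              // HEAD:
	fmt.Println(buildExpression("HEAD", "gitleaks.toml")) // HEAD:gitleaks.toml
}
```

With the second form the server resolves one object instead of listing every entry at the root, which is the cost difference described above.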
Hi @aminvielledebatAtBedrock,
with expression equal to HEAD:. That means you fetch all the files at the root of a repository. If you want only one file across hundreds of repositories, you lose a lot of time.
Hmm, makes sense. It is definitely time-consuming.
For the other tables, it means you have to find the SHA before fetching your HEAD, and actually, I'm not able to decode my file content with the github_blob table 😢
Yes, we have to find the SHA first to query the github_blob table, which requires more input to query.
However, I have raised a PR to address the above discussion.
Usage overview for the github_repository_content table: use repository_full_name and path to get a particular file's content in a repository.
> select type, name, path, path_raw, mode, line_count from github_repository_content where repository_full_name = 'turbot/steampipe-plugin-github' and path = 'docs/tables/github_audit_log.md'
+------+---------------------+---------------------------------+----------------------------------------------+------+------------+
| type | name | path | path_raw | mode | line_count |
+------+---------------------+---------------------------------+----------------------------------------------+------+------------+
| blob | github_audit_log.md | docs/tables/github_audit_log.md | ZG9jcy90YWJsZXMvZ2l0aHViX2F1ZGl0X2xvZy5tZA== | 0 | 204 |
+------+---------------------+---------------------------------+----------------------------------------------+------+------------+
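The path_raw value in the row above appears to be the Base64 form of the path column. A quick way to check this in Go, assuming standard Base64 encoding (the helper name `decodeBase64` is hypothetical):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeBase64 decodes a standard Base64 string into text.
func decodeBase64(s string) (string, error) {
	b, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// path_raw value copied from the query result above.
	decoded, err := decodeBase64("ZG9jcy90YWJsZXMvZ2l0aHViX2F1ZGl0X2xvZy5tZA==")
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(decoded) // docs/tables/github_audit_log.md
}
```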
I hope this PR will meet our requirements. Please give it a try and share your feedback.
Thanks!
Hey @ParthaI, and thanks for your PR. It seems to be perfect :)
A simple question: is it acceptable to use the column path for GET and keep repository_content_path for LIST?
Thank you so much
Hi @aminvielledebatAtBedrock,
A simple question: is it acceptable to use the column path for GET and keep repository_content_path for LIST?
I think so, as per the code changes in this PR, the updates align with our other plugin development standards.
If you have any further requirements, please let us know, such as needing selective files at a time.
We are moving the conversation here: The implementation does not look correct since the two columns mentioned in the comment below are not returned by the GetConfig.
Regarding the Steampipe plugin standards:
Need suggestions:
I did not find a relevant way to present the column values for IsGenerated and Mode with the suggestions provided or with the implementation made in the above-mentioned PR.
@misraved and I discussed this, and it does not seem optimal according to our table development standards that the List API call provides more details than the Get API call.
Do you have any suggestions (specifically a GitHub GraphQL query) that will return all the column values we have today?
Any suggestions would be highly appreciated!
CC @graza-io, @misraved
Thank you!
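One possible direction, offered as an untested sketch rather than a confirmed fix: in the queries quoted in this thread, isGenerated and mode are selected on the tree entries, not inside the Blob fragment. A Get could therefore query the parent directory's tree and keep only the entry whose name matches. The names `TreeEntry`, `parentExpression`, and `findEntry` are hypothetical, and the sample entries are made up.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// TreeEntry holds the entry-level fields that the Blob fragment lacks.
type TreeEntry struct {
	Name        string
	Mode        int
	IsGenerated bool
}

// parentExpression splits a non-empty file path into the "<rev>:<dir>"
// expression for its parent tree and the entry name to look up in it.
func parentExpression(rev, filePath string) (string, string) {
	clean := path.Clean(strings.Trim(filePath, "/"))
	dir, name := path.Dir(clean), path.Base(clean)
	if dir == "." {
		dir = "" // file sits at the repository root
	}
	return rev + ":" + dir, name
}

// findEntry returns the entry with the given name, if present.
func findEntry(entries []TreeEntry, name string) (TreeEntry, bool) {
	for _, e := range entries {
		if e.Name == name {
			return e, true
		}
	}
	return TreeEntry{}, false
}

func main() {
	expr, name := parentExpression("HEAD", "docs/tables/github_audit_log.md")
	fmt.Println(expr, name)

	// Made-up entries standing in for the parent tree's query result.
	entries := []TreeEntry{
		{Name: "github_audit_log.md", Mode: 33188},
		{Name: "github_blob.md", Mode: 33188},
	}
	if e, ok := findEntry(entries, name); ok {
		fmt.Println(e.Name, e.Mode, e.IsGenerated)
	}
}
```

The trade-off is that the parent directory's entries are still fetched, so this narrows the request to one directory without reducing it to a single object.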
Describe the bug
The table github_repository_content is not able to return only one file when repository_content_path is used.
Steampipe version (steampipe -v)
v0.21.8
Plugin version (steampipe plugin list)
v0.42.0
To reproduce
Expected behavior
One record with the gitleaks.toml file.
Additional context
With the following query, I get my record: