r-world-devs / GitAI

Extracting knowledge from Git repositories in R

add repos metadata #23

Open kalimu opened 1 week ago

kalimu commented 1 week ago

We are not yet returning repo metadata together with the processed content. I'm not sure exactly what we should return, though. The repo URL for sure, and maybe also which files were used to generate the output. Any thoughts, @maciekbanas?

Nothing special... repo URLs and the files used should do the job for now. Maybe in the future we will see what else we need (timestamps, ids, authors, repo type, etc.). I would stay minimal for the time being; let's add only what we see as crucial now.

Originally posted by @maciekbanas in #6
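To make the "minimal" proposal concrete, here is a rough sketch of what a result carrying the repo URL and the list of used files could look like. It is written in Python only for illustration; `RepoResult`, `with_metadata`, and the example URL are hypothetical names, not part of GitAI or GitStats.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RepoResult:
    """Processed content plus the minimal metadata discussed above (hypothetical shape)."""
    content: str                                     # output generated from the repository
    repo_url: str                                    # where the content came from
    files: List[str] = field(default_factory=list)   # files used to generate the output


def with_metadata(content, repo_url, files):
    # Sort and deduplicate file paths so equal inputs give equal metadata.
    return RepoResult(content=content, repo_url=repo_url, files=sorted(set(files)))


result = with_metadata(
    "Summary of the repository...",
    "https://example.com/org/repo",  # placeholder URL, not a real repository
    ["README.md", "DESCRIPTION", "README.md"],
)
print(result.repo_url, result.files)
```

Further fields (timestamps, ids, authors, repo type) could be added to the same structure later without changing how the content itself is returned.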

@maciekbanas Ah, you reminded me that we also need some metadata for checking the cache (in the future). We could hash the file content, but that takes some processing time. Maybe we could instead extract the last-modified timestamp of the repo, or even of the files we are extracting, and take the most recent one. We could save this timestamp for the repo, and when it changes we will know that we should overwrite the db/cache.
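A minimal sketch of the timestamp idea, assuming we can obtain a last-push time for the repo and, optionally, per-file modification times. It is written in Python only for illustration; the function names and the in-memory `cache` dict are hypothetical and stand in for whatever db/cache is used in the end.

```python
from datetime import datetime, timezone


def latest_timestamp(repo_last_push, file_timestamps):
    """Return the most recent of the repo push time and any known file times."""
    candidates = [repo_last_push] + [t for t in file_timestamps if t is not None]
    return max(candidates)


def needs_refresh(cache, repo_url, current_timestamp):
    """True when the repo is not cached yet or has changed since it was cached."""
    cached = cache.get(repo_url)
    return cached is None or current_timestamp > cached


def update_cache(cache, repo_url, current_timestamp):
    # Remember the timestamp the stored results were generated from.
    cache[repo_url] = current_timestamp


cache = {}
url = "https://example.com/org/repo"  # placeholder URL
push = datetime(2024, 11, 1, 12, 0, tzinfo=timezone.utc)
file_times = [datetime(2024, 10, 30, 9, 0, tzinfo=timezone.utc), None]

stamp = latest_timestamp(push, file_times)
first_check = needs_refresh(cache, url, stamp)    # nothing cached yet
update_cache(cache, url, stamp)
second_check = needs_refresh(cache, url, stamp)   # unchanged since caching
```

Compared with hashing file content, this only compares a stored timestamp with a fresh one, so the cost per repo is a single metadata lookup. The trade-off is that it relies on the timestamps being available and trustworthy.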

maciekbanas commented 1 week ago

With GitStats get_repos() we easily get the last push date, but with get_files_content() I think we do not get the date of the file. Maybe it would be good to make adjustments in GitStats first (a new date column in the files table)?