ubiquity-os-marketplace / command-ask

0 stars 4 forks source link

fix: ignore file path and diffs #21

Closed sshivaditya2019 closed 4 weeks ago

sshivaditya2019 commented 1 month ago

Resolves #19

sshivaditya2019 commented 1 month ago

QA: issue

With the same 15 mb index.js file in the PR

github-actions[bot] commented 1 month ago

Unused devDependencies (1)

Filename devDependencies
package.json @types/diff

Unused exports (1)

Filename exports
src/helpers/issue.ts optimizeContext
sshivaditya2019 commented 1 month ago

QA: Issue

sshivaditya2019 commented 1 month ago

Perhaps my spec isnt clear but youre not following it.

  • Remove all the file ignores.
  • Count amount of changes per file.
  • sort in order of amount of changes
  • filter out files from most changes to least based on context limits.

This will automatically filter out large automated changes like compiled dist and lock files.

I think that counting the number of changes isn't the best metric. For instance, dist/index.js had only about 300 changes. A more effective metric would be the diff size in bytes.

I'm following the guidelines outlined in this comment. Is there another specification for this issue?

0x4007 commented 1 month ago

Basically this strategy would start by excluding dist and likely other large changes like lock file etc.

This could have been worded better. I meant that it would automatically exclude those based on how large their diffs are.

GitHub pull request code view UI handles this in a smart way, where it wont display the dist/index.js and lock files due to "large diffs"

Perhaps they also have a line length limitation. It may be wise to replicate.

Then again, 300 line changes is significant. Much more than a normal file diff i think?

This would almost certainly be filtered out if there were a ton of file changes with ~10 lines changed per each etc.

sshivaditya2019 commented 1 month ago

GitHub pull request code view UI handles this in a smart way, where it wont display the dist/index.js and lock files due to "large diffs"

It seems to use file size as a criterion to determine whether to show the UI, as those files wouldn't be accessible even in a regular file viewer without the diffs.

Then again, 300 line changes is significant. Much more than a normal file diff i think?

It is indeed significant, more than what you'd typically see in a standard file diff. However, while 300 lines could potentially be changed through code, the diff size in bytes for index.js was around 15.6 MB. In comparison, a code file with the same number of line changes would be much smaller in size.

0x4007 commented 1 month ago

Alright so then it seems clear to me that we can measure the diff size in bytes and then go from there.

sshivaditya2019 commented 4 weeks ago

@0x4007 I need some clarification. Currently, the system retrieves both filenames and the difference size in bytes, and then it fetches the diffs for each filename. Once that’s done, we sort the results in ascending order and continue adding to the context until we reach the maximum limit.

I followed along up to the sorting step in the specifications, but I'm unsure if I understand it correctly. What criteria should be used to select the diffs? Is it based on file size being under a certain threshold, or is there another method we should use?

0x4007 commented 4 weeks ago

I think that's all you need to do. What is the problem?

sshivaditya2019 commented 4 weeks ago
  • Filter out files from the most changes to the least, according to context limits.

This will automatically exclude large automated changes, such as compiled distribution and lock files.

Ok, I see now that I misunderstood these lines. I think this PR is ready to be merged. I'll post more QA for this.

sshivaditya2019 commented 4 weeks ago

QA: Link