Closed babsey closed 8 years ago
I hadn't been aware of smt list
writing to a file so far. As far as I can tell the only purpose of the generated file .smt/labels
is to allow tab completion of labels (see bin/smt-complete.sh
).
I can't comment on the performance implications of writing to a file in smt list
. My hunch is that the overhead of this is small compared to the time it takes to extract the records from the database (at least for Django record stores), but I haven't profiled this so I may be wrong.
Regardless, if the purpose of .smt/labels
really is to allow tab completion then it seems wrong to have this file generated by smt list
because this requires a manual step by the user and the file will easily get out of sync with existing records. Instead, it should be generated and automatically updated by smt run
and smt delete
commands. @apdavison, do you agree with this or am I missing something?
As far as I can tell the only purpose of the generated file .smt/labels is to allow tab completion of labels (see bin/smt-complete.sh)
This is correct.
The reason this slows things down is not the writing to file; it is that the current implementation ends up calling project.find_records()
twice.
it seems wrong to have this file generated by smt list because this requires a manual step by the user and the file will easily get out of sync with existing records. Instead, it should be generated and automatically updated by smt run and smt delete commands.
I agree with this. The smt run
part is easy, we just append to the end of .smt/labels
. Deletion is trickier: profiling will be needed to see whether it is more efficient to rewrite the file with the deleted labels removed or just regenerate the entire file.
Concerning the structure of this PR, I would prefer to have a separate PR for each step.
I suggest renaming this pull request to something like "Make bash completion optional" and opening a new one for filtering records (cf #309).
Firstly, if this PR is about making bash completion optional then the command line switch should have a different name to reflect this. (Oh never mind, I just saw that @apdavison already suggested this in a line comment.)
However, making bash completion optional never was the intention of this PR - it was meant to improve performance. I'm not completely opposed to making it optional, but I think we should avoid introducing lots of options just because we can. If it's cheap in terms of performance to keep the bash completion functionality then why have a switch to disable it? (Again, I'm not sure if it really is cheap, but it sounds to me like the performance-critical parts are unrelated to actually writing to the file .smt/labels
, which is the only part that affects bash completion).
maxalbert is right. Let's keep it enabled. There is a way for fast writing to a file. Can you tell me if you can confirm an improvement of the last commit?
Sorry for re-renaming the title.
I definitely see a speed improvement with @babsey's latest change. In a project with 432 records, smt list
takes 60 seconds with project.format_records()
and 45 seconds with project.record_store.labels()
. I'd be happy to merge this as a first step, although I still think we should break up the writing of .smt/labels
into incremental updates in smt run
and smt delete
.
Generally speaking, 45 seconds for a simple smt list
is still pretty sluggish. If we are consistent with keeping .smt/labels
up to date then we could consider simply returning its contents unless the --long
or --table
options are specified, which would at least make a simple smt list
make much faster. However, at the end of the day this only fixing the symptoms of the recordstore database being slow so ideally we should tackle the root of the issue.
Btw, if project.record_store.labels()
turns out something we intend to use in more places then it could be worth adding a new method project.get_labels()
which calls this under the hood - just to have slightly better encapsulation and looser coupling between classes.
@maxalbert: I agree on all points.
@babsey: please could you squash the commits? Once that's done I can merge this PR, or I could leave it open for further work. Let me know what you would prefer.
I also agree. I wrote a new method get_labels()
for the project object.
Futhermore, I made the next step for improved speed of plain smt list command. This should be a very fast. The question is where I can implement these code lines. I am open-minded for your suggestions.
@apdavison: I never squashed commits but if you agree on these steps. I will try to squash these commits
@babsey I haven't looked at the latest changes yet, but in terms of how to squash commits: I personally really like using git's interactive rebase for this. If you run
git rebase -i master
This will open up an editor which will show you all the commits on your curent branch that are on top of master
, and each of them is prepended by "pick"
. If you keep the first one as "pick"
and change all the subsequent ones to "s"
or"squash"
then this will combine all the commits on top of master into a single one. (Maybe make a backup of your repo before you try it the first time just in case something goes wrong.)
Once you have squashed the commits you will need to push the branch to Github again:
git push --force origin per_impr_CLI
The --force
is important, otherwise git will complain because you are trying to push some changes that don't build upon changes that are already present on Github.
cf discussion on #309 and #310
The concept of this PR is to improve performance for
smt list
. I would like to make this PR step by step. After merging I will continue writing codes for this PR.The first step is to stop the constant writing in file. Next steps will be limiting amount of records for print or filtering by date, script file etc...
Which of these steps do you desire?