LibriVox / librivox-catalog

LibriVox catalog and reader workflow application
https://librivox.org
MIT License
37 stars 17 forks source link

Reveal keywords on project's catalogue page #225

Open peterjdann opened 4 months ago

peterjdann commented 4 months ago

The effect of the changes proposed here would be:

In compiling this proposal, I have looked for opportunities to develop any relevant unit tests, but have not seen any.

peterjdann commented 4 months ago

I am puzzled by the error message: The migration class \"Migration_Add_keywords_instances_field\" is missing an \"up\" method."

I have the following settings in catalog/application/config/migration.php:

$config['migration_enabled'] = TRUE;
$config['migration_type'] = 'timestamp';
$config['migration_version'] = 20240516002600;
$config['migration_path'] = APPPATH . 'migrations/';

The following are the contents of catalog/application/migrations/20240516002600_add_keywords_instances_field.php

<?php

defined('BASEPATH') OR exit('No direct script access allowed');

class Migration_Add_keywords_instances_field extends CI_Migration {

        public function up()
        {
        $fields = array(
                'instances' => array('type' => 'INT',
            'default' => '1',
            'after' => 'value')
        );
        $this->dbforge->add_column('keywords', $fields);
        }

        public function down()
        {
                $this->dbforge->drop_column('keywords', 'instances');
        }
}
redrun45 commented 4 months ago

I'm glad you're back to work on this! I think the "fix-previous" commits could stand to be squashed together, but this chain of commits is much more linear than the last. I'll let Artom request any changes he wants or needs on that front.

So, I've got some observations on what the code does, and I'll be happy to work with you on what might improve it on that front.

peterjdann commented 4 months ago

Thank you very much for this helpful initial feedback.

I did not know it was possible to “squash” git commits until now, but have just looked up some info on that, and see it is possible. Sorry you’ve having to deal with such a git (and software review) newbie here. Will wait for advice from Artom on that, as you suggest.

I’m away from my development environment for the next four days. I will look more carefully at your other suggestions as soon as I am back. Your explanations all seem very clear to me.

Before I start making any changes, I would like to wait on whatever initial advice Artom may have on how to proceed from here.

Can I assume that where I make changes to address the issues you have raised, I should, as a matter of principle, ’squash’ the commit for each further change with the previous commit affecting the same area of code?

Peter Dann

the On 17 May 2024, at 3:19 PM, redrun45 @. @.> Iom> wrote:

I'm glad you're back to work on this! I think the "fix-previous" commits could stand to be squashed together, but this chain of commits is much more linear than the last. I'll let Artom request any changes he wants or needs on that front.

So, I've got some observations on what the code does, and I'll be happy to work with you on what might improve it on that front.

Migration via php public_html/index.php migrate migrate went just fine! Tags appeared, all with instances at 1 until the cron job first runs. 👍 I wasn't able to call the new cron endpoint (I tried curl -k https://librivox.org/cron/Keywords_stats/update_keywords_stats), until I added an 's' to the controller file's name, to match the class, "Keywords_stats.php". After that, I could update the stats, and tags became click-able! The numbers also look right at first blush, though I'll go back and look for anomalies or off-by-ones at some point. 😉 Clicking on a keyword took me straight to /keywords page, listing projects tagged with that keyword! 👍 It might be nice if this endpoint name was singular to match /author and /reader, but this is me nit-picking. 😝 This endpoint does not seem to reflect the sort order in the drop-down box. I think this comes down to get_projects_by_keywords_id_and_project_type in Project_model.php, where I see a sort order hard-coded. When filtering this page by project type (solo vs group), the pagination links at the bottom still show the total number of pages. Judging by the other controllers, I think we're missing the project type constraint when defining $full_set. — Reply to this email directly, view it on GitHub https://github.com/LibriVox/librivox-catalog/pull/225#issuecomment-2116664807, or unsubscribe https://github.com/notifications/unsubscribe-auth/APUJ53G5VJTKR34QTN3S4DTZCWHOZAVCNFSM6AAAAABHZXA4W2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMJWGY3DIOBQG4. You are receiving this because you authored the thread.

redrun45 commented 4 months ago

Can I assume that where I make changes to address the issues you have raised, I should, as a matter of principle, ’squash’ the commit for each further change with the previous commit affecting the same area of code?

I believe that's the way to go - each commit in the final PR should result in working code, even if it isn't used yet. Beyond that, it seems quite subjective, though perhaps I'm missing a convenient guideline.

Myself, I'd put the PHP 8 fix and the configuration change all into the first commit, so that it results in working code that completes one "thing". All the future commits can move on from that to other "things", without changing the code from your first commit unless it's really necessary. This does call for rewriting history (after all, we didn't know the migration had a PHP 8 problem until chronologically late!), but Artom mentioned some tools for helping with that.

peterjdann commented 4 months ago

I have had a go at addressing what I think are the major concerns redrun has expressed so far (though I'm afraid I don't understand redrun's point about the cron job endpoint, and have not addressed that). I have now had a (reasonably brutal!) first person introduction to rewriting a git commit history — the need for such, the means at one's disposal, and the many (many!) pitfalls lying in wait for the beginner. I hope I've made some useful progress in this area too.

redrun45 commented 4 months ago

This looks (to my eye) much, much better! I wanted to send you to a straightforward tutorial and explanation, but the stuff I found mostly made my head hurt, even though I assimilated some of the 'how' of things. I'm glad you also "made it through". 😉

My cron job issue comes down to a test: set the instances counts to something obviously incorrect (say, 1) , and try to trigger the stats update (curl -k https://librivox.org/cron/Keywords_stats/update_keywords_stats is how I tried triggering it). What I saw was HTML in the terminal output, and the instances counts did not update. Is that what you see?

I fixed it by changing the name of the file 'application/controllers/cron/Keyword_stats.php'. Since the class it implements is called 'Keywords_stats' (note the plural), I added an 's' to the filename, and then it worked.

When it comes to updating a commit when there are later ones in the chain, I believe either git rebase or git cherry-pick could save you from needing to re-do the later commits. I'd have to (save several backup copies, and then) play around to find the right sequence of commands and edits.

peterjdann commented 4 months ago

Thanks for that explanation of the cron job issue (which should have been staring me in the face). I've fixed that now, and have pushed that change and all "subsequent" changes here again.

redrun45 commented 3 months ago

Overall, this looks to be working well enough, that I can really take a hammer to it next time I get the chance! :laughing: These are all the notes I've got for now, and it's fine by me if you wait for that additional testing before acting on them.

peterjdann commented 3 months ago

Thank you. These are good suggestions.

I’ll wait to see what a good hammering may throw up before implementing them.

Peter

On 7 Jun 2024, at 2:26 PM, redrun45 @.***> wrote:

Overall, this looks to be working well enough, that I can really take a hammer to it next time I get the chance! 😆 These are all the notes I've got for now, and it's fine by me if you wait for that additional testing before acting on them.

— Reply to this email directly, view it on GitHub https://github.com/LibriVox/librivox-catalog/pull/225#issuecomment-2153910113, or unsubscribe https://github.com/notifications/unsubscribe-auth/APUJ53HQHCSFNWZW4AC4QX3ZGEY7XAVCNFSM6AAAAABHZXA4W2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNJTHEYTAMJRGM. You are receiving this because you authored the thread.

redrun45 commented 3 months ago

Well, I've finally sat down to configure my dev box to catalog mock projects again. I need to do a PR to add this to the Ansible scripts sometime. :thinking:

That was the last test I knew we couldn't do on the staging server. If you'd like to implement the changes I previously mentioned, I won't hold you up any longer, and it can go to staging.

For convenience, here are quick summaries of those three things:

Once at least the first two are addressed, we'll ask if Artom can load it up on Staging for more people to play with. Perhaps along with #231, for the full experience! :wink:


Then on a more personal note, I know that text isn't my best medium for building empathy. Code reviews, in particular, seem very "transactional" to me, as they're all focused on business. They'd be harder to read, if they weren't! But that can make them, if not antagonistic, at least a little adversarial, which I know I find frustrating from the relationship side.

I want to say: Thank you! Thank you for caring in the first place to suggest alleviations for the pain-points you see. Thank you for being respectful as you made those suggestions, rather than expecting somebody to "just" do X.

In our forum conversation, I was most explicit in saying that I would not ask someone to work on our code, for reasons that extend beyond the messy code itself. But I'm glad you did. Thank you again for volunteering - and learning, and returning - to work on this anyway.

To make this happen, we needed you. With the amount of new code this took, and the number of other bugs and quirks that have bothered volunteers for years, I don't know if I'd have ever gotten to such a big new feature. Between you and Gareth (and hopefully me), we've made and are making these changes in the past few months, that folks will enjoy for years, even if nobody specifically asked for them to happen right now.

You and I are excited to work on the code, as and when we can. Please be patient with me, as I attempt to save other volunteers (who did not sign up to do code review or QA testing) as much work, worry and stress as possible. Significant updates come with all of those things, for more than just us folks here on GitHub. Even a small update can have unexpected quirks, and we don't want to pressure people to find them before they have to live with them. (See also: #233 :sweat: ) Thank you for your patience so far, and I hope we'll all get to see this live fairly soon!

peterjdann commented 3 months ago

Well, I've finally sat down to configure my dev box to catalog mock projects again. I need to do a PR to add this to the Ansible scripts sometime. 🤔

That was the last test I knew we couldn't do on the staging server. If you'd like to implement the changes I previously mentioned, I won't hold you up any longer, and it can go to staging.

For convenience, here are quick summaries of those three things:

  • Technical - the .gitignore file being removed from the repo
  • Technical - Project_keyword_model.php line 19 concatenating the project id, instead of using a '?'. (Line 22 sets up the DB library so that it will escape and substitute the id in place of the '?', which is the preferred method.)
  • Usability suggestion - in views/catalog/keywords.php, add a header to display the particular keyword we're browsing.

Once at least the first two are addressed, we'll ask if Artom can load it up on Staging for more people to play with. Perhaps along with #231, for the full experience! 😉

Then on a more personal note, I know that text isn't my best medium for building empathy. Code reviews, in particular, seem very "transactional" to me, as they're all focused on business. They'd be harder to read, if they weren't! But that can make them, if not antagonistic, at least a little adversarial, which I know I find frustrating from the relationship side.

I want to say: Thank you! Thank you for caring in the first place to suggest alleviations for the pain-points you see. Thank you for being respectful as you made those suggestions, rather than expecting somebody to "just" do X.

In our forum conversation, I was most explicit in saying that I would not ask someone to work on our code, for reasons that extend beyond the messy code itself. But I'm glad you did. Thank you again for volunteering - and learning, and returning - to work on this anyway.

To make this happen, we needed you. With the amount of new code this took, and the number of other bugs and quirks that have bothered volunteers for years, I don't know if I'd have ever gotten to such a big new feature. Between you and Gareth (and hopefully me), we've made and are making these changes in the past few months, that folks will enjoy for years, even if nobody specifically asked for them to happen right now.

You and I are excited to work on the code, as and when we can. Please be patient with me, as I attempt to save other volunteers (who did not sign up to do code review or QA testing) as much work, worry and stress as possible. Significant updates come with all of those things, for more than just us folks here on GitHub. Even a small update can have unexpected quirks, and we don't want to pressure people to find them before they have to live with them. (See also: #233 😓 ) Thank you for your patience so far, and I hope we'll all get to see this live fairly soon!

I have now attempted to address all three of the technical issues you have (very conveniently - thank you!) summarised here and have pushed the changes through.

On the personal side, I doubt if you can have any idea how glad and thankful I was to receive your very kind, generous and timely message. You have been the soul of discretion, and the soul of helpfulness, through this whole process, for ever acting in what I have long since come to recognise as the true Librivox spirit. If I can live up to that ethos half as well as you do, I will be very glad indeed (😃) - and probably a bit surprised, too!

redrun45 commented 3 months ago

Excellent, I'm glad my nit-picking hasn't driven you away just yet! So far as I'm concerned, the LibriVox spirit is: "volunteers, helping volunteers." I know you've got the same spirit in mind, and I just hope for a reminder whenever I get lost in the weeds myself. 😄

Back to the business side, those changes address everything I know we need for this PR. You mentioned you might make changes to the other one, but I think we can call this one "Ready for Staging"!

peterjdann commented 3 months ago

Excellent, I'm glad my nit-picking hasn't driven you away just yet! So far as I'm concerned, the LibriVox spirit is: "volunteers, helping volunteers." I know you've got the same spirit in mind, and I just hope for a reminder whenever I get lost in the weeds myself. 😄

Back to the business side, those changes address everything I know we need for this PR. You mentioned you might make changes to the other one, but I think we can call this one "Ready for Staging"!

  • In terms of what it does: Looks good to me, but wiser heads than mine should do some further testing.
  • In terms of code: Similarly, if Gareth, Artom or anyone has further suggestions... 😉

Thank you. I'll try to steel myself for whatever may follow 😄! As for "the other one", l have already made further changes to #231 which I'm personally happy with. Those changes, if acceptable for staging yet, might complete the changes here nicely.

peterjdann commented 3 months ago

It's possible that when administrators see this capability in action, they'll be quietly (or not so quietly) horrified at some of the keywords they see are in actual use. It occurs to me that one way to alleviate this horror ever so slightly might be to provide a function that made it easy for administrators quickly to edit those keywords. While it's not necessarily ideal for this purpose, the "add_catalog_item" page does support this functionality, and it's a relatively simple exercise to add to the project catalog page a link, visible only to logged in administrators, that would take them straight to this page for the project concerned. I've mocked that up on my local development system as a proof-of-concept. Hopefully the attached document illustrates properly what I mean. Support easy editing of keywords for existing projects.pdf

redrun45 commented 3 months ago

This link looks like a nice convenience feature in general! It's much more useful for other things than for keywords, though. I did say that updating keywords is far beyond the scope of a PR. My hope all along is that this can be genuinely useful without that. 😅

I hope I haven't created a false impression that updating keywords would be a common thing: in fact the keywords containing ";" are an exceptionally easy case because they only involve the "add_catalog_item" page. I could go into a little bit more detail in a PM some other time. So far, I haven't seen any "can't you just" here on GH, but some folks may come along who don't yet know better.

Perhaps this new link should be its own thing, over in the side-bar. And though I'm sorry to add more for Artom to sort through, I think a separate PR would be appropriate.

peterjdann commented 3 months ago

This link looks like a nice convenience feature in general! It's much more useful for other things than for keywords, though. ... Perhaps this new link should be its own thing, over in the side-bar. And though I'm sorry to add more for Artom to sort through, I think a separate PR would be appropriate.

Good idea - but sorry Artom! Have put this forward now as #234

peterjdann commented 3 months ago

How do we want to handle a situation where an audiobook has what might appear to be on over-long list of keywords associated with it?

One approach would be to truncate the list of keywords we display, showing, say, only the most widely used keywords.

A disadvantage of this approach is that it might bewilder a user who has done an Advanced search for a certain keywords term, the search throws up BOOK X, but BOOK X (which, behind the scenes, turns out to have a long list of keywords) apparently is not associated with this keywords term, to judge by what appears on its catalog page.

This could be confusing.

The change I have just proposed has the effect of truncating a long list of keywords (for example, more than 10) on the catalog page for audiobooks that have such long lists, but adding a "Show full list" link which, if clicked, causes the full list to be revealed. For audiobooks whose list of keywords is within the prescribed limit (here coded as 10) the behaviour is the same as it was previously.

Three screenshots follow.

This audiobook has many keywords defined for it. Accordingly, we truncate the list.

Truncated list

When the user clicks "Show full list", they see the whole list of keywords.

Full list revealed

For a book with a relatively brief list of keywords, the behaviour is as previously coded in this PR. There is no "Show full list" link.

List requiring no truncation
redrun45 commented 3 months ago

We'll see how that feels! I'll wait for that one more additional change we discussed in PM - dropping the 'librivox' and 'audiobooks' keywords, as they are redundant within our own database. Once that's in, we'll ask again to go to staging. 😉

peterjdann commented 3 months ago

Have added a migration action that drops "librivox", "audiobook" and other terms that appear to be intended to have these meanings from projects. If anyone has any tips on how to test a CodeIgniter 3 migration in our particular routing setup before conducting one for real, I'd like to hear about it. I'd also like to know if dbforge (the orthodox tool for writing migration actions) might, in fact, allow one to frame a delete query involving a subquery. Because I wasn't confident it would, I tried a workaround here, loading and invoking a model function.

I do notice, incidentally, the following snippet in the definition of the CI_Migration class that seems to imply dbforge might not be the only way to go when writing migrations:

// They'll probably be using dbforge $this->load->dbforge();

redrun45 commented 3 months ago

If anyone has any tips on how to test a CodeIgniter 3 migration in our particular routing setup before conducting one for real, I'd like to hear about it.

If you mean testing it on your dev box before sending it in as a PR, I use this line (shamelessly copied from Artom's ansible playbook): php public_html/index.php migrate migrate If a migration fails (yours worked just fine!), you can re-run it without entirely resetting the database, by altering migrations.version to an earlier date.

I'd also like to know if dbforge (the orthodox tool for writing migration actions) might, in fact, allow one to frame a delete query involving a subquery. Because I wasn't confident it would, I tried a workaround here, loading and invoking a model function.

Actually, I don't think we have to use dbforge at all! I moved your SQL query from Project_keyword_model straight into the migration file, and it worked just fine. Looks like dbforge is loaded in as a convenience, but we still have the broader range of options, including $this->db->query(). :grin:

peterjdann commented 3 months ago

If anyone has any tips on how to test a CodeIgniter 3 migration in our particular routing setup before conducting one for real, I'd like to hear about it.

If you mean testing it on your dev box before sending it in as a PR, I use this line (shamelessly copied from Artom's ansible playbook): php public_html/index.php migrate migrate If a migration fails (yours worked just fine!), you can re-run it without entirely resetting the database, by altering migrations.version to an earlier date.

I'd also like to know if dbforge (the orthodox tool for writing migration actions) might, in fact, allow one to frame a delete query involving a subquery. Because I wasn't confident it would, I tried a workaround here, loading and invoking a model function.

Actually, I don't think we have to use dbforge at all! I moved your SQL query from Project_keyword_model straight into the migration file, and it worked just fine. Looks like dbforge is loaded in as a convenience, but we still have the broader range of options, including $this->db->query(). 😁

Thank you! That's very helpful. I did a quick (obviously incomplete) dive into the CI source code and wasn't able to convince myself that $this->db would actually be available, and thinking I had no easy way to test (short of pushing the change here) I opted for an alternative approach. But doing the database cull entirely inside the migration action seems cleaner to me, so I've now followed that approach, and have removed the use-once-only method from the project_keyword model.

redrun45 commented 1 month ago

@notartom - at your convenience, I believe this is ready to go to Staging. We might request other changes, but it seems like testing this out is the next step forward. 😀

notartom commented 1 month ago

Deployed on staging as requested!

peterjdann commented 1 month ago

Was struggling today to get on top of the changes to the Migration functions and all the different CONFLICT messages git was throwing at me about this while I was doing multiple rebases and then trying to clean up my commit history. Looks like I've managed to leave things in a screwed up state with respect to that Migration change. I can't do any more on this tonight, but will look at it again tomorrow.

peterjdann commented 1 month ago

I attempted two pushes that address notartom's review comments so far around 14 hours ago. The changes I have made have not, as far as I can see from a quick check, "broken" anything functionally on my local development system, certainly not with respect to the new keywords displaying capability. The first push/deployment failed because I had left some glitchy code stuff in Migration.php. After I cleaned that up, I tried again. The second push/deployment has failed at a website health check. I'm not sure I know how to troubleshoot that from where I'm sitting. Would appreciate advice on how best to proceed from here.

redrun45 commented 1 month ago

The second push/deployment has failed at a website health check. I'm not sure I know how to troubleshoot that from where I'm sitting. Would appreciate advice on how best to proceed from here.

I think I see the trouble: this first commit includes the instruction files for running the two new migrations, but it does not set $config['migration_version'] = 3 in application/config/migration.php, which would tell CodeIgniter to run those migrations. This causes the Page controller to crash when it tries to use the new count column, which hasn't been created.

If you can also remove the change to system/libraries/Migration.php from the same commit, I think the migration is all set without that line. :wink:

peterjdann commented 1 month ago

The second push/deployment has failed at a website health check. I'm not sure I know how to troubleshoot that from where I'm sitting. Would appreciate advice on how best to proceed from here.

I think I see the trouble: this first commit includes the instruction files for running the two new migrations, but it does not set $config['migration_version'] = 3 in application/config/migration.php, which would tell CodeIgniter to run those migrations. This causes the Page controller to crash when it tries to use the new count column, which hasn't been created.

If you can also remove the change to system/libraries/Migration.php from the same commit, I think the migration is all set without that line. 😉

Thanks for your advice, which I've followed as best I can. Deployment, at least, seems to have succeeded now.

redrun45 commented 1 month ago

Excellent!

Oh and... woops. Sorry Artom, I didn't mean do "request a review", I know you'll see this in your own time. I checked this on mobile, and I was expecting a tool-tip or something to tell me exactly what that little +/- icon meant. 😬

peterjdann commented 3 days ago

It concerns me that on the list of Pull requests (https://github.com/LibriVox/librivox-catalog/pulls) this is still showing up as "Changes requested", which seems to imply that the code has been at least partly reviewed, and I have not followed up by making requested changes. This is not the case. When changes were requested more than two months ago, I addressed them pretty promptly — and have heard nothing since.

redrun45 commented 1 hour ago

@peterjdann Honestly, I'm not sure how to change that. All the technical changes we've requested, I believe you're up to date with. The non-technical questions are... well, moving slowly, and I hope to have some news for you on that soon-ish, but don't let the GitHub status bother you if you can help it. :sweat_smile: