jarun / buku

:bookmark: Personal mini-web in text
GNU General Public License v3.0
6.52k stars 294 forks source link

[bukudb] Various improvements #660

Closed LeXofLeviafan closed 1 year ago

LeXofLeviafan commented 1 year ago

fixes #648:

fixes #654:

fixes #652:

fixes #653:

fixes #664 (again):

also:

The changes are split in two commits (return values & refactoring) for better viewing experience.

LeXofLeviafan commented 1 year ago

…I'll re-push these commits with rebase master applied once I make sure the tests actually pass.

LeXofLeviafan commented 1 year ago

…Alright, I'm confused: it says there's still some changes requested, but I only see two threads and both are marked as "resolved" :thinking:

jarun commented 1 year ago

I will need time to review this.

jarun commented 1 year ago

I am done with the review. I think the only difference of opinion here is whether to return -1 or None for indices.

My issues with None are:

@rachmadaniHaryono what's your opinion? Also, please review.

LeXofLeviafan commented 1 year ago

My issues with None are:

Since we're waiting for @rachmadaniHaryono to give an opinion, I'll leave a direct link to the original discussion which starts with my arguments for making it closer to idiomatic Python.

(…Also it just occurred to me that -1 is indeed an idiomatically valid index value in Python, which makes it even worse of a fit for "no index" value for this language.)

I am done with the review.

…I'm still confused about the immutable values: do you want to change them to bools in the API calls (but not in DB), or just to add a comment for the bitmask constants? Note that it doesn't make much sense to keep it an integer in the API since the argument name refers only to one bit of the bitmask (so if you end up adding another flag, it's going to need a separate argument to be used anyway).

jarun commented 1 year ago

immutable values: do you want to change them to bools in the API calls (but not in DB)

This is fine.

jarun commented 1 year ago

@rachmadaniHaryono we are waiting on your input for this.

rachmadaniHaryono commented 1 year ago

replacing BookmarkVar definition with a typed namedtuple

i thought dataclasses maybe better but looking at following SO link, it is maybe better to use namedtuple

https://stackoverflow.com/questions/51671699/data-classes-vs-typing-namedtuple-primary-use-cases

immutable on record

maybe add immutable property on BookmarkVar

class BookmarkVar(NamedTuple):
    ...
    @property
    def immutable(self)->bool:
        ...

[FLAG_NONE, FLAG_IMMUTABLE] = [0x00, 0x01]

i would rather do this because there is no other flag yet

FLAG_NONE = 0
FLAG_IMMUTABLE = 1

i see there is some change on type on docstring

i would rather remove it entirely and just have it on function type hint


big change

my comment on dec 15 2022

https://github.com/jarun/buku/issues/648#issuecomment-1352314207

im undecided on this

the reasoning seem good but it may break other program that use buku

but after checking related buku project, not a lot of program use buku with python

also the next release should be major version, so maybe just go for it?

i havent check bukuserver yet

things changed

on my break i tried to write program to use buku as bookmark manager for my logseq pkm tool

and there is indeed problem especially when the documentation is only on source code

i will explain on next section

also the next release should be major version, so maybe just go for it?

after i think more about it, if we dont add it on the next major release we have to release it on another major release and i dont think it is better.

just put it on one big release

but before next release we should also add more documentation

https://www.youtube.com/watch?v=azf6yzuJt54

i do have some note on example on using buku for developer and will expand on that


for my program i need to get buku record on url and there is no direct function for it

before that i know that to get rec_id, so naively i think this should work

>>> import buku
>>> bdb = buku.BukuDb()
rec = bdb.get_rec_by_id( bdb.add_rec('http://example.com') )

this is wrong when url is already on buku

correct solution should be

>>> url = 'https://example.com'
>>> rec_id = bdb.add_rec(url)
... if rec_id == -1:
...     rec_id = bdb.get_rec_id(url)
>>> bdb.get_rec_by_id(rec_id)

if operation return None, this could be easier

rec = bdb.get_rec_by_id(bdb.add_rec(url) or bdb.get_rec_id(url))

this also show that if a function require int parameter we could check for None value and raise error

other solution is to create function get_or_create_rec(url)

similar function

https://docs.djangoproject.com/en/4.2/ref/models/querysets/#get-or-create

jarun commented 1 year ago

@LeXofLeviafan please proceed as suggested by @rachmadaniHaryono.

jarun commented 1 year ago

@rachmadaniHaryono can you please list all the changes made in the bukuserver side since the last release?

rachmadaniHaryono commented 1 year ago

wait, @jarun so we can change it to return None?

if not clear, i agree with @LeXofLeviafan to change return value from -1 to None

e:

list all the changes made in the bukuserver side since the last release

i will add that as task on https://github.com/jarun/buku/pull/657

jarun commented 1 year ago

Yes, it's OK to change it to -1, I lost 2:1. ;)

Please add the list to https://github.com/jarun/buku/issues/484 under Cooking.

rachmadaniHaryono commented 1 year ago

another thing for @LeXofLeviafan

there is https://docs.python.org/3/library/collections.html#collections.somenamedtuple._asdict which could simplify namedtuple to dict on some function

LeXofLeviafan commented 1 year ago

[FLAG_NONE, FLAG_IMMUTABLE] = [0x00, 0x01]

i would rather do this because there is no other flag yet

FLAG_NONE = 0
FLAG_IMMUTABLE = 1

The reason I'm using hexadecimal notation is specifically to make it clear that it's a bitmask (i.e. the decimal values don't actually matter, only enabled bits are); not to mention this notation also makes it clear which bits are enabled in the bitmask.

there is https://docs.python.org/3/library/collections.html#collections.somenamedtuple._asdict which could simplify namedtuple to dict on some function

That is indeed a convenient method… though it's not actually that much useful in this case, seeing as BookmarkVars contain raw DB data instead of the data format used in JSON export & bukuserver API (i.e. tags are a string wrapped in commas, and immutable is hidden within flags… though the latter isn't exported by buku, it'd be present in _asdict anyway).

maybe add immutable property on BookmarkVar

…I'm suddenly getting an urge to rename tags to _tags, and create two properties: tags that matches _tags[1:-1] (which is used way too often in the source, let's be honest), and tagslist that matches [x for x in bookmark.tags.split(',') if x] (this one is less prominent but certainly is convenient to have).

LeXofLeviafan commented 1 year ago

Also, @jarun – there was a discussion in #665 regarding the import of readline/pyreadline3 (which affects I/O properties of the program, and therefore should not be done when buku is used as a library – as opposed to running it as a CLI program).

I suggested moving the import to the beginning of main(), and @rachmadaniHaryono suggested that I add the change to this pull request. Do you have any problem with that?

LeXofLeviafan commented 1 year ago

Note: did a rebase.

LeXofLeviafan commented 1 year ago

i see there is some change on type on docstring

i would rather remove it entirely and just have it on function type hint

That makes sense, I suppose (type hints are included in the function docs – at least the ones produced by help()), but the types are listed in docstrings pretty much everywhere so they should probably be changed everywhere; I'd rather leave them alone for now, and maybe refactor the docstrings on a separate pull request or something. …Also, @jarun – your opinion on that?

jarun commented 1 year ago

I suggested moving the import to the beginning of main(), and @rachmadaniHaryono suggested that I add the change to this pull request. Do you have any problem with that?

This is fine.

Also, @jarun – your opinion on that?

You can leave it for now.

BTW, once this is in, can we think about some good features? One long-standing request has been to add urls to the wayback machine but I didn't find any simple way to do that without adding tons of deps.

jarun commented 1 year ago

@LeXofLeviafan please document the API changes in https://github.com/jarun/buku/issues/484 so that it is available for third-party utilities.

LeXofLeviafan commented 1 year ago

One long-standing request has been to add urls to the wayback machine but I didn't find any simple way to do that without adding tons of deps.

This one?

Browse a bookmark (possibly dead URL) on Wayback machine

It seems to be marked as done, for some reason (or what does checked checkbox mean in the TODO list?) :sweat_smile:

As for "how", I'm guessing querying their API is an option. (…Incidentally, this page also mentions a "Memento API", which appears to be intended for other sites with similar purpose; maybe this should be looked into as well/instead)

jarun commented 1 year ago

Looking up on wayback machine is available. Not adding to it. I was looking for a solution with minimum deps.

LeXofLeviafan commented 1 year ago

a solution with minimum deps

json.load( urllib.request.urlopen(…) )?

LeXofLeviafan commented 1 year ago

…Now that I re-read it, you seem to be talking about submitting a page to webarchive. That seems to be possible as well, though, also by using their API (i.e. making a request to the /save endpoint).

Alternatively (if that doesn't work out, for whatever reason), waybackpy could me made an optional dependency.

LeXofLeviafan commented 1 year ago

immutable values: do you want to change them to bools in the API calls (but not in DB)

This is fine.

Incidentally, what about CLI? Should the argument parameter be kept as 1/0, or changed to something more conventional like y/n?

LeXofLeviafan commented 1 year ago

…Say

On Linux, globally installed certifi appears to default to /etc/ssl/certs/ca-certificates.crt if it's available; and a virtualenv install probably shouldn't be accessing global files anyway. Should we maybe simply use certifiy.where() unconditionally, instead of this quirky-looking logic?

if sys.platform.startswith('linux') and os.path.isfile('/etc/ssl/certs/ca-certificates.crt'):
    CA_CERTS = '/etc/ssl/certs/ca-certificates.crt'
else:
    import certifi
    CA_CERTS = certifi.where()
LeXofLeviafan commented 1 year ago

…Speaking of adding features – it would make sense to support importing data exported with --json (at least when no fields are filtered out).

LeXofLeviafan commented 1 year ago

…Alright, this seems to be all (except for possibly CLI changes for --immutable in case they're accepted as well).

And yes, the field filtering code is rather unintuitive, so I refactored it too.

This was added to the "also" list:

  • moved readline import into main() (to avoid loading it when buku is used as a library)
  • fixed sys_platform dependency condition to match the possible values set
  • changed field filtering implementation to declarative + updated JSON generation to adhere to --format description

P.S. Also updated the branch with bukuserver functional tests – should be ready for a pull-request after this one is merged.

LeXofLeviafan commented 1 year ago

~Edit: fix for #666~

LeXofLeviafan commented 1 year ago

~Edit: fixed flask-reverse-proxy-fix version in bukuserver/requirements.txt~

LeXofLeviafan commented 1 year ago

…On second thought, let's do those in a separate pull-request :sweat_smile:

jarun commented 1 year ago

@rachmadaniHaryono please confirm if this is ready to merge.

rachmadaniHaryono commented 1 year ago

i have seen this several time

{
'id': bookmark.id,
'url': bookmark.url,
'title': bookmark.title,
'tags': bookmark.taglist,
'description': bookmark.desc,
}

maybe extend _asdict so it can be done in single function

for example

# from
res = jsonify({
  'url': bookmark.url,
  'title': bookmark.title,
  'tags': bookmark.taglist,
  'description': bookmark.desc,
})
# to
res = jsonify({bookmark._asdict())
LeXofLeviafan commented 1 year ago

…Like I said, that's not what _asdict() will produce on named-tuples of this type. The result will have keys id, url, title, tags_raw, desc and flags. Even if you don't care about raw DB fields being included, the properties will not be in it. Also, the keys for some fields in bukuserver API differ from keys in JSON generated by buku; so the best we can do here is make a separate function for generating a bukuserver API entity out of a BookmarkVar (in bukuserver/api.py).

LeXofLeviafan commented 1 year ago

Note: did a rebase, fixed typo, minor refactoring.

LeXofLeviafan commented 1 year ago

Smh the local check missed unused imports :sweat_smile:

rachmadaniHaryono commented 1 year ago

…Like I said, that's not what _asdict() will produce on named-tuples of this type. The result will have keys id, url, title, tags_raw, desc and flags. Even if you don't care about raw DB fields being included, the properties will not be in it. Also, the keys for some fields in bukuserver API differ from keys in JSON generated by buku; so the best we can do here is make a separate function for generating a bukuserver API entity out of a BookmarkVar (in bukuserver/api.py).

i thought _asdict could be simpler like this

def _asdict(self):
    return {k: v for k, v in self._asdict().items() for k in ('url', 'title', ...)}

but i do like entity function implementation better


this one is optional. is it ok to use id as parameter? it is not reserved word but it is built in function

https://stackoverflow.com/a/6350862

i will approve it for now

LeXofLeviafan commented 1 year ago

this one is optional. is it ok to use id as parameter? it is not reserved word but it is built in function

Well it's used that way already in the code, both as a parameter and as a local variable. Anyway, there's no usecases for id() in the codebase, so overriding it locally shouldn't cause any issues.

jarun commented 1 year ago

Thank you!

thetechnodino commented 1 year ago

Thanks to the help I received from you guys, I was able to get the software installed and working.

Thanks

John