termux / termux-language-server

🛠️ A language server for some specific bash scripts
https://termux-language-server.readthedocs.io/
GNU General Public License v3.0
71 stars 7 forks source link

List of known licenses is wrong for Gentoo / generate schema / Sort variable names #19

Closed powerman closed 2 months ago

powerman commented 2 months ago

For example, here are list of GPL licenses allowed by termux-language-server while checking *.ebuild:

GPL-1.0-only
GPL-1.0-or-later
GPL-2.0-only
GPL-2.0-or-later
GPL-2.0-with-autoconf-exception
GPL-2.0-with-classpath-exception
GPL-2.0-with-font-exception
GPL-2.0-with-GCC-exception
GPL-3.0-linking-exception
GPL-3.0-linking-source-exception
GPL-3.0-only
GPL-3.0-or-later
GPL-3.0-with-autoconf-exception
GPL-3.0-with-GCC-exception
GPL-CC-1.0

but here are actual Gentoo GPL licenses (output of (cd /usr/portage/licenses/ && ls -1 GPL-*)):

GPL-1
GPL-1+
GPL-2
GPL-2+
GPL-2-with-classpath-exception
GPL-2+-with-eCos-exception-2
GPL-2-with-exceptions
GPL-2-with-font-exception
GPL-2-with-linking-exception
GPL-2-with-MySQL-FLOSS-exception
GPL-2+-with-openssl-exception
GPL-2+-with-Pyinstaller-Bootloader-exception
GPL-3
GPL-3+
GPL-3+-with-autoconf-exception
GPL-3-with-font-exception
GPL-3+-with-font-exception
GPL-3-with-openssl-exception

as result I get this error:

$ grep LICENSE termux-language-server-0.0.23.ebuild
LICENSE="GPL-3"
$ termux-language-server --check termux-language-server-0.0.23.ebuild
termux-language-server-0.0.23.ebuild:15:10-15:15:error: 'GPL-3' does not match '(389-exception|Abstyles|CDL-1.0|DOC|...
powerman commented 2 months ago

Not sure is this a bug or not and if yes then is it related to previous one, but I also get few other false positives after license-related error:

$ termux-language-server --check termux-language-server-0.0.23.ebuild
termux-language-server-0.0.23.ebuild:15:10-15:15:error: 'GPL-3' does not match '...
termux-language-server-0.0.23.ebuild:16:1-16:5:warning: SLOT: is unsorted due to LICENSE@termux-language-server-0.0.23.ebuild:15:1-15:7
termux-language-server-0.0.23.ebuild:17:1-17:9:warning: KEYWORDS: is unsorted due to LICENSE@termux-language-server-0.0.23.ebuild:15:1-15:7
termux-language-server-0.0.23.ebuild:31:1-31:2:warning: S: is unsorted due to DESCRIPTION@termux-language-server-0.0.23.ebuild:4:1-4:11

You can see the whole ebuild at https://github.com/powerman/powerman-overlay/blob/master/dev-util/termux-language-server/termux-language-server-0.0.23.ebuild

powerman commented 2 months ago

Few more issues. I can open separate issues, if you like, but I'm unsure is these features are supposed to work at all in current version.

Freed-Wu commented 2 months ago

Is there any document about license names for Gentoo?

Freed-Wu commented 2 months ago

it was unable to find and process existing file /usr/share/man/man5/ebuild.5.bz2.

It use platfromdirs.site_data_path() and platfromdirs.user_data_path() to get the path. Perhaps you use XDG_* to change its behaviour?

Refer https://github.com/platformdirs/platformdirs to know more.

Freed-Wu commented 2 months ago

on each run produces random order of variables.

--format will sort the order of some variables. However, it is not fully finished. Current behaviour is just exchange some variables' order. :cry:

powerman commented 2 months ago

it was unable to find and process existing file /usr/share/man/man5/ebuild.5.bz2.

It use platfromdirs.site_data_path() and platfromdirs.user_data_path() to get the path. Perhaps you use XDG_* to change its behaviour?

No, I didn't changed, but that's not the point. This is a system file, not a user one - why are you looking for it in a user-specific directories at all?

powerman commented 2 months ago

Is there any document about license names for Gentoo?

If you're looking for a document then you can check links to each license group mentioned in https://wiki.gentoo.org/wiki/License_groups. But I suppose it may be easier for you to just take all allowed license names from file names in the repo: https://github.com/gentoo/gentoo/tree/master/licenses

powerman commented 2 months ago

Also, it may make sense to check actually available licenses on user's system in $(portageq get_repo_path / gentoo)/licenses/. Not sure is it good idea or not… Unlikely language server will be used to handle ebuild files on a system without Gentoo installed, but it's still possible. Also it's unclear how often license list changes (and even if it change new licenses unlikely will be widely used, so updating list of known licenses in language server may not be an urgent task).

Freed-Wu commented 2 months ago

This is a system file, not a user one

What is the result in your machine:

from platformdirs import site_data_dir
site_data_dir('man')
Freed-Wu commented 2 months ago

https://github.com/gentoo/gentoo/tree/master/licenses

It is /usr/portage/licenses, right?

Freed-Wu commented 2 months ago

GPL-1.0-only GPL-1.0-or-later GPL-2.0-only

These are SPXD license names. Gentoo doesn't support them, right?

Freed-Wu commented 2 months ago

Also it's unclear how often license list changes (and even if it change new licenses unlikely will be widely used, so updating list of known licenses in language server may not be an urgent task).

Agree :+1:

powerman commented 2 months ago
$ python
Python 3.12.3 (main, Jun  8 2024, 17:58:56) [GCC 13.2.1 20240210] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from platformdirs import site_data_dir
>>> site_data_dir('man')
'/usr/local/share/man'
>>> 

This is correct site data dir. But man pages for OS packages are never installed in site dir, they're in system dir /usr/share/man.

It is /usr/portage/licenses, right?

It depends on user configuration. 20 years ago when I've installed Gentoo it was /usr/portage/licenses/ by default. Several years ago default was changed to /var/db/repos/gentoo/licenses/. Plus user can change this. So don't guess, instead call the tool which will give you a correct answer:

$ portageq get_repo_path / gentoo
/var/db/repos/gentoo
$ ls $(portageq get_repo_path / gentoo)/licenses/ | head
0BSD
2dboy-EULA
9wm
Activision
aczoom
adom
AFL-2.1
AFL-3.0
AGPL-3
AGPL-3+
$

It can even give you a path to license file, but I doubt this will be useful for you:

$ portageq license_path / gentoo GPL-3
/var/db/repos/gentoo/licenses/GPL-3
$ portageq license_path / gentoo GPL-nosuch

$

This is a standard tool (it comes with "portage" package, so every Gentoo user is guaranteed to have it installed), but it's not fast so avoid calling it more than once:

$ time portageq get_repo_path / gentoo
/var/db/repos/gentoo

0.162 real  0.131 user  0.031 sys  33MB RAM

These are SPXD license names. Gentoo doesn't support them, right?

No idea what's SPXD and what Gentoo does with them, sorry.

Freed-Wu commented 2 months ago

SPXD

Sorry, it is SPDX https://spdx.org/licenses/

site_data_dir('man') '/usr/local/share/man'

In my machine, there are /usr/share/man. Except:

$ XDG_DATA_DIRS=/usr/local/share ptpython
from platformdirs import site_data_dir
site_data_dir('man')
'/usr/local/share/man'

That's why I ask if you set $XDG_DATA_DIRS

powerman commented 2 months ago

Hmm. Maybe something is misconfigured on your machine then. Look, here is another system, this time very usual Ubuntu in a cloud:

# python3
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from platformdirs import site_data_dir
>>> site_data_dir('man')
'/usr/local/share/man'
>>> 
# env | grep XDG
XDG_SESSION_TYPE=tty
XDG_SESSION_CLASS=user
XDG_SESSION_ID=1
XDG_RUNTIME_DIR=/run/user/0

Here are envs from my Gentoo:

$ env | grep XDG
XDG_CONFIG_DIRS=/etc/xdg
XDG_SEAT=seat0
XDG_SESSION_TYPE=x11
XDG_SESSION_CLASS=greeter
XDG_VTNR=7
XDG_SESSION_ID=1
XDG_RUNTIME_DIR=/var/run/user/1000
XDG_DATA_DIRS=/usr/local/share:/usr/share
$ rg XDG_DATA_DIRS /etc/env* /etc/profile*
/etc/env.d/30xdg-data-local
1:XDG_DATA_DIRS="/usr/local/share"
2:COLON_SEPARATED="XDG_DATA_DIRS XDG_CONFIG_DIRS"

/etc/env.d/90xdg-data-base
1:XDG_DATA_DIRS="/usr/share"

/etc/environment.d/10-gentoo-env.conf
33:XDG_DATA_DIRS=/usr/local/share:/usr/share

/etc/profile.env
35:export XDG_DATA_DIRS='/usr/local/share:/usr/share'

I didn't changed these settings, everything was set per Gentoo defaults.

Maybe the actual issue isn't in using $XDG_DATA_DIRS but in checking only first dir included in this var instead of all of them?

powerman commented 2 months ago

BTW, the spec https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html says:

If $XDG_DATA_DIRS is either not set or empty, a value equal to /usr/local/share/:/usr/share/ should be used.

Which probably means single value returned by site_data_dir('man') on Ubuntu above is not suitable for searching for man pages in all data dirs.

Freed-Wu commented 2 months ago

Which probably means single value returned by site_data_dir('man')

We can use site_data_dir('man', multipath=True). I'll fix it.

Freed-Wu commented 2 months ago

SPDX https://spdx.org/licenses/

Can gentoo support it?

powerman commented 2 months ago

I don't think so. Here is authoritative decision about it:

https://devmanual.gentoo.org/general-concepts/licenses/index.html#adding-new-licenses When adding a new license, it should be checked whether there is an established name in the SPDX license list. However, Gentoo does not consider the Software Package Data Exchange to be an authoritative standard, so we may sometimes deviate from it, e.g., if their "short identifier" is excessively long. Also, we generally don't rename our existing identifiers.

Freed-Wu commented 2 months ago

OK, I see:

powerman commented 2 months ago

on each run produces random order of variables.

--format will sort the order of some variables. However, it is not fully finished. Current behaviour is just exchange some variables' order. 😢

Is there a way to disable this behaviour (for LSP mode) until it's will be properly implemented (by forcing variable order required by Gentoo)?

Freed-Wu commented 2 months ago

I think don't use format action is okay?

powerman commented 2 months ago

Well… I'm using neovim/nvim-lspconfig and stevearc/conform.nvim (actually it's a neovim kickstart config). They both don't have own setup for termux-language-server. I've added configuration from your doc https://termux-language-server.readthedocs.io/en/latest/resources/configure.html#neovim - and it does reformat on *.ebuild file save. So, I don't "use format action" intentionally and have no idea how to disable it just for termux-language-server (I suppose I can disable whole conform plugin for some file types but I still wanna use, say, shfmt for ebuilds, so I don't want to disable all formaters).

Freed-Wu commented 2 months ago

have no idea how to disable it just for termux-language-server

I believe neovim should have the method to do it?

Currently, if we hope the order is pkgname, pkgver, pkgrel.

pkgver=0.0.1
pkgrel=1
pkgname=shm

After we execute format, it notice pkgname should be before pkgver, so it exchanage their position:

pkgname=shm
pkgrel=1
pkgver=0.0.1

Then we execute format again, it notice pkgver should be before pkgrel, so it exchanage their position:

pkgname=shm
pkgver=0.0.1
pkgrel=1

Now, the order is correct, execute format will do nothing.

In design, we should only execute format one time. However, it haven't be realized, we have to execute many times format action to get the final result.

powerman commented 2 months ago

I believe neovim should have the method to do it?

As far as I can tell at least some language servers provide a custom option for this. Examples: cds_lsp, clangd uses config file, cssls (and several other vscode ls uses the same), dockerls allows to configure formatter, and so on…

Anyway, it wasn't easy, but I've figured out how to disable formatting:

vim.api.nvim_create_autocmd('LspAttach', {
    callback = function(args)
        local client = vim.lsp.get_client_by_id(args.data.client_id)
        if client and client.name == 'termux' then
            client.server_capabilities.documentFormattingProvider = nil
        end
    end,
})

Now, the order is correct, execute format will do nothing.

Actually, it really stops reordering at some point, but the final order is wrong. :disappointed: If we'll use termux-language-server-0.0.24.ebuild as an example, then final order will became this:

# Copyright 2024 Gentoo Authors
# Distributed under the terms of the GNU General Public License v2

S=${WORKDIR}/${PN//-/_}-${PV}

DISTUTILS_USE_PEP517=setuptools
PYTHON_COMPAT=(python3_{10..12})

inherit distutils-r1 pypi

DESCRIPTION="A language server for some specific bash scripts"
EAPI=8
SRC_URI="$(pypi_sdist_url "${PN//-/_}")"

HOMEPAGE="https://github.com/termux/termux-language-server https://pypi.org/project/termux-language-server/"

KEYWORDS="~amd64"
SLOT="0"
LICENSE="GPL-3"

DEPEND="
    dev-python/fqdn[${PYTHON_USEDEP}]
    dev-python/lsp-tree-sitter[${PYTHON_USEDEP}]
    dev-python/platformdirs[${PYTHON_USEDEP}]
    dev-python/rfc3987[${PYTHON_USEDEP}]
    dev-libs/tree-sitter-bash[${PYTHON_USEDEP}]
    dev-python/license-expression[${PYTHON_USEDEP}]
    >=dev-python/tree-sitter-0.22.0[${PYTHON_USEDEP}]
"
RDEPEND="${DEPEND}"

This is clearly wrong just because EAPI isn't the first one. So, the result is:

$ pkgcheck scan
dev-util/termux-language-server
  SourcingError: version 0.0.24: failed sourcing ebuild: distutils-r1: EAPI 0 not supported, (distutils-r1.eclass, line  49:  called die)

If we move EAPI on top, then there are other order issues too:

$ pkgcheck scan
dev-util/termux-language-server
  VariableOrderWrong: version 0.0.24: variable DESCRIPTION should occur before S
  VariableOrderWrong: version 0.0.24: variable HOMEPAGE should occur before SRC_URI
  VariableOrderWrong: version 0.0.24: variable LICENSE should occur before SLOT
  VariableOrderWrong: version 0.0.24: variable SLOT should occur before KEYWORDS

Example with 100% valid order: https://github.com/gentoo/gentoo/blob/master/skel.ebuild Probably there are some more comprehensive doc about order of other possible variables.

Freed-Wu commented 2 months ago

Anyway, it wasn't easy, but I've figured out how to disable formatting:

:+1:

This is clearly wrong

the order is gotten from man 5 ebuild https://dev.gentoo.org/~zmedico/portage/doc/man/ebuild.5.html

powerman commented 2 months ago

the order is gotten from man 5 ebuild

No, this source does not define any recommended order of variables.

Actually, it turns out to be uneasy to find actual recommendations about the order. Here is what I know:

But the thing is, each and every related doc mention we have to use pkgcheck: Basic guide to write Gentoo Ebuilds, Gentoo git workflow, Standard git workflow, ... So, it looks like your tool should follow rules used by pkgcheck to be helpful.

And here is related pkgcheck doc:

https://pkgcore.github.io/pkgcheck/man/pkgcheck.html#variableordercheck Scan ebuilds for variables defined in a different order than skel.ebuild dictates.

So, it looks like the only variables with forced order are listed in skel.ebuild, plus there is one additional rule: EAPI must be the first variable in a file. Any other variables may be anywhere below EAPI, they may be interleaved with ones listed in skel.ebuild.

Freed-Wu commented 2 months ago

In fact, the feature to sort the variable's order is not only for Gentoo's ebuild, also for Android termux's build.sh and ArchLinux's PKGBUILD.

For build.sh, the order is decided by https://github.com/termux/termux-packages/wiki/Creating-new-package#table-of-available-package-control-fields For PKGBUILD, the order is decided by man PKGBUILD. For ebuild, the order is decided by man ebuild. It uses same code as PKGBUILD.

Perhaps gentoo can give an official recommanded order and use the order to sort man ebuild?

powerman commented 2 months ago

For ebuild, the order is decided by man ebuild.

No, it's not. At least I never read official statements about this.

Perhaps gentoo can give an official recommanded order

It's already given in skel.ebuild.

and use the order to sort man ebuild?

Well, you can propose that to Gentoo developers.

In any case specific order is required only for variables included in skel.ebuild, for all other vars some specific order is not defined/required.