bitextor / warc2text

Extracts plain text, language identification and more metadata from WARC records
MIT License

Release versions #38

Closed: jelmervdl closed this 1 year ago

jelmervdl commented 1 year ago

Hi @lpla,

Would you be okay with starting to tag releases? It would make installing and selecting specific versions for use [in projects](github.com/hplt-project) easier.

Honestly, I would be happy with just v1, v2, v3, etc.: anything that makes it pinnable. Semver is also okay.

lpla commented 1 year ago

Hi, Jelmer.

As I will be stepping away from active development of this tool quite soon, @mespla, @cgr71ii or @aarongaliano should probably say whether they can commit to this request.

jelmervdl commented 1 year ago

I noticed I still have write access to this repo, so I can do it myself. But I'd love to have Alicante's input on what type of versioning they'd like to stick to.

lpla commented 1 year ago

I would say semver, as it is the way we are doing code releases in the whole Bitextor organization (Bifixer, Bicleaner, Biroamer...). Let's see what people say.

cgr71ii commented 1 year ago

Hi!

We're ok with tagging releases, so go ahead. Once done, I'd say we should run Bitextor's tests just in case, and then close the issue.

I think semver is also the correct approach, for consistency with the other tools, as lpla mentioned.

ZJaume commented 1 year ago

Shall we start with 1.0.0?

jelmervdl commented 1 year ago

I suggest we make 673e3717288a78ee4af65d71462d43f3ec1ed3bb v1.0.0 (I think that was the one that was used by Oslo in HPLT) and eac887ee613954fdea7b5d8e46de36912f0c5f7e v1.1.0 since it added the unk group for CLD2 and fasttext support.
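To illustrate the semver reasoning behind these two tags, here is a minimal Python sketch. It assumes the new features in the second commit are backward compatible, which is what a minor bump (1.0.0 to 1.1.0) signals under semver. The helpers `parse_tag` and `bump_minor` are hypothetical names made up for this example and are not part of warc2text or Bitextor; only the two commit hashes come from the comment above.

```python
import re

# Hypothetical helpers for illustration only; not part of warc2text.
TAG_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_tag(tag: str) -> tuple[int, int, int]:
    """Turn a tag such as 'v1.1.0' into a comparable (major, minor, patch) tuple."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH tag: {tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def bump_minor(tag: str) -> str:
    """Next tag after adding backward-compatible features: minor + 1, patch reset to 0."""
    major, minor, _ = parse_tag(tag)
    return f"v{major}.{minor + 1}.0"


# The two commits proposed in the comment above, keyed by their suggested tag.
PROPOSED_TAGS = {
    "v1.0.0": "673e3717288a78ee4af65d71462d43f3ec1ed3bb",
    "v1.1.0": "eac887ee613954fdea7b5d8e46de36912f0c5f7e",
}

# Compare numerically, not as strings, so that v1.10.0 would sort after v1.2.0.
latest = max(PROPOSED_TAGS, key=parse_tag)
print(latest, PROPOSED_TAGS[latest])
```

Comparing parsed tuples instead of raw strings is the main design choice here: it keeps the ordering correct once any version component reaches two digits.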

jelmervdl commented 1 year ago

They have been tagged :tada:

akutuzov commented 1 year ago

Hmm, only tags, no actual releases?

jelmervdl commented 1 year ago

Now they're also actual releases.

akutuzov commented 1 year ago

Aha, now they are :)