Mxrcon / BioNameGenerator

A small project to generate random repository/release/project names based on Biological names and related topics
MIT License

Create a psake task to generate the dictionaries #2

Open abhi18av opened 2 years ago

abhi18av commented 2 years ago

Hi @Mxrcon ,

I noticed that https://github.com/Mxrcon/BioNameGenerator/blob/main/BioNameGenerator/Databases/Generate-Databases.ps1 is basically a small task that generates the SQLite database from the TSV files.

I was wondering whether it makes sense to turn it into a psake task.

Nothing urgent, but something nice to have as it'll guide the collaborators.

Mxrcon commented 2 years ago

I completely agree, Abhinav. I was worried about keeping the TSVs outside the repository. I don't want to distribute the TSV files within the package, but for development purposes I can add them to the root folder of the repository and then create a psake task as you mentioned.

Mxrcon commented 2 years ago

Hey @abhi18av, the TSV sources are actually modified from different web sources, and some of them had inconsistencies that I had to solve manually, like names with u instead of ü, and cleanups of some unnecessary compound names like Iron-II vs Iron-III.
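Spelling inconsistencies of the u versus ü kind could in principle be flagged before the manual cleanup. The following is only a minimal Python sketch of that idea and is not part of the repository; the function names `fold` and `find_spelling_variants` are hypothetical, and it assumes the names have already been read from a TSV into a list of strings.

```python
import unicodedata
from collections import defaultdict


def fold(name):
    """Return a comparison key: diacritics stripped and case folded."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop combining marks (e.g. the diaeresis that NFKD splits off from "ü").
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


def find_spelling_variants(names):
    """Group names that differ only by diacritics or letter case.

    Returns a dict mapping the folded key to the sorted list of distinct
    spellings, keeping only keys that have more than one spelling.
    """
    groups = defaultdict(set)
    for name in names:
        groups[fold(name)].add(name)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}
```

Each group it reports still needs a human decision on which spelling is correct; the sketch only points at candidates.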

What do you think about adding the TSV files to the root folder of this repository in this structure:

Dictionaries/
├── Generate-Databases.ps1
└── TsvDictionaries
    ├── 28kAdjectives.tsv
    ├── 5kColors.tsv
    ├── Aminoacids.tsv
    ├── Animals.tsv
    ├── BacterialGeneras.tsv
    ├── BacterialSpecies.tsv
    ├── BiologicalBooks.tsv
    ├── BrazilianScientists.tsv
    ├── ChemicalCompounds.tsv
    ├── Colors.tsv
    ├── ComputationKeywords.tsv
    ├── Dictionaries.db
    ├── FieldsWinners.tsv
    ├── LaboratoryKeywords.tsv
    ├── MetalsAndAlloys.tsv
    ├── NF-Adjectives.tsv
    ├── NF-Names.tsv
    ├── NobelLaureates.tsv
    ├── NucleicAcids.tsv
    ├── PeriodicTableElements.tsv
    └── RPGKeywords.tsv

This script would be called by the psake task to generate a database, and we would have two tasks:

  1. Generate database (use all TSVs to generate a SQLite database)
  2. Update database (generate a new database and update BioNameGenerator/Databases/Dictionaries.db)
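The two tasks above can be sketched roughly as follows. This is an illustration in Python rather than the repository's actual `Generate-Databases.ps1` or a psake build file, and it rests on assumptions that the thread does not confirm: each TSV holds one name per line in its first column with no header row, each file becomes one table named after the file, and the table has a single `name` column. The function names `generate_database` and `update_database` are hypothetical.

```python
import csv
import shutil
import sqlite3
from pathlib import Path


def generate_database(tsv_dir, db_path):
    """Task 1: build a fresh SQLite file with one table per TSV file."""
    tsv_dir, db_path = Path(tsv_dir), Path(db_path)
    if db_path.exists():
        db_path.unlink()  # always start from an empty database
    tables = []
    conn = sqlite3.connect(str(db_path))
    try:
        for tsv in sorted(tsv_dir.glob("*.tsv")):
            table = tsv.stem  # e.g. "Colors" for Colors.tsv (assumed naming)
            with tsv.open(newline="", encoding="utf-8") as handle:
                reader = csv.reader(handle, delimiter="\t")
                # Keep the first column only and skip blank lines.
                rows = [(r[0].strip(),) for r in reader if r and r[0].strip()]
            # Table names cannot be bound as parameters, so quote the
            # identifier; this also covers names like "NF-Names".
            quoted = '"' + table.replace('"', '""') + '"'
            conn.execute(f"CREATE TABLE {quoted} (name TEXT NOT NULL)")
            conn.executemany(f"INSERT INTO {quoted} (name) VALUES (?)", rows)
            tables.append(table)
        conn.commit()
    finally:
        conn.close()
    return tables


def update_database(tsv_dir, build_db, package_db):
    """Task 2: regenerate the database, then copy it over the packaged one."""
    tables = generate_database(tsv_dir, build_db)
    Path(package_db).parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(build_db, package_db)
    return tables
```

Keeping generation and update as separate steps means the database can be rebuilt and inspected without touching the copy shipped inside the package.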

Unfortunately, I'm not sure I'll be able to write a pwsh script that completely reproduces the process of fetching the Wikipedia pages and then formatting them into the TSVs.

Kindly, Davi