Literature, References, Resources, Papers, Links, Links to Libraries etc.

Twenkid commented 1 year ago

Note, 4.1.2023: During this research effort I have been browsing, reviewing, revisiting and studying a huge number of articles and concepts, many of them reached by association while browsing, as a source of ideas. The best approach would be to store them in some special representation: a DB, a semantic network, etc.

For now I am starting with one link out of many hundreds, maybe a thousand so far, picked out of general curiosity, and growing the list from that seed. Organizing them is a research & development project on its own: an automatic analysis and learning assistant, a reading assistant and accelerator, a cognitive accelerator, etc. There is an unpublished "in-house" project and experimental application called [Research] Assistant, or ACS for short (Assistant C#), which serves as a playground and inspiration for ideas and developments in the direction of "Cognitive Acceleration". In a broader sense, though, any computer and any software is such a tool.

Various statistical similarity methods: https://en.wikipedia.org/wiki/Semantic_similarity

A blog on Question Answering etc.: https://queryunderstanding.com/

Twenkid commented 4 months ago

Speech Recognition datasets etc. https://ai.meta.com/blog/voxpopuli-the-largest-open-multilingual-speech-corpus-for-ai-translation-and-more/ https://arxiv.org/abs/2006.13979 https://ai.meta.com/blog/xls-r-self-supervised-speech-processing-for-128-languages/

Language Identification library (fastText): tested; use the small model.

https://fasttext.cc/docs/en/language-identification.html https://huggingface.co/facebook/fasttext-language-identification
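A minimal sketch of how the language identifier can be called from Python. The file name `lid.176.ftz` is an assumption (the compressed model offered on the fastText page; the note above only says "the small model"), and `load_small_model` and `detect_language` are hypothetical helper names, not part of the library. Only `fasttext.load_model` and `model.predict` are fastText calls.

```python
def load_small_model(path="lid.176.ftz"):
    """Load a fastText language-ID model from a local file.

    The default path is an assumption: download the model first and
    point `path` at wherever it was saved.
    """
    import fasttext  # imported lazily so detect_language works without the package
    return fasttext.load_model(path)


def detect_language(model, text, k=1):
    """Return [(language_code, probability), ...] for the k most likely languages."""
    # fastText predicts on a single line, so newlines are replaced by spaces.
    labels, probs = model.predict(text.replace("\n", " "), k=k)
    # Labels come back prefixed with "__label__"; strip the prefix.
    return [(label.replace("__label__", ""), float(p)) for label, p in zip(labels, probs)]
```

Typical use would be `model = load_small_model()` followed by `detect_language(model, "Това е изречение на български.")`. The exact label format depends on the model file (the Hugging Face model linked above uses longer codes that include the script).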

Common Crawl tools

https://github.com/facebookresearch/cc_net

Huge Dataset
https://github.com/togethercomputer/RedPajama-Data ... https://arxiv.org/abs/2007.10310

Twenkid commented 4 months ago

Bulgarian POS-tagger and NER-tagger (applied): https://github.com/AMontgomerie/bulgarian-nlp

https://github.com/AMontgomerie/bulgarian-nlp/blob/master/examples/pos_example.ipynb https://github.com/AMontgomerie/bulgarian-nlp/blob/master/examples/text_annotator_example.ipynb

About the Named-entity tags: https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging)
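To make the tag format concrete, here is a small sketch that groups IOB-tagged tokens into entity spans. It assumes the common `B-TYPE` / `I-TYPE` / `O` convention described on the linked page; the function name `iob_to_spans` and the `PER` / `LOC` labels are illustrative and may differ from what the tagger above actually emits.

```python
def iob_to_spans(tokens, tags):
    """Group IOB-tagged tokens into (entity_type, text) spans."""
    spans = []
    current_type, current_tokens = None, []

    def flush():
        # Close the entity currently being built, if there is one.
        nonlocal current_type, current_tokens
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        prefix, _, entity_type = tag.partition("-")
        if prefix == "B":
            # B- always starts a new entity.
            flush()
            current_type, current_tokens = entity_type, [token]
        elif prefix == "I" and current_type == entity_type:
            # I- continues the open entity of the same type.
            current_tokens.append(token)
        elif prefix == "I":
            # I- without a matching open entity: treat it as a new entity.
            flush()
            current_type, current_tokens = entity_type, [token]
        else:
            # "O" (or any unrecognized tag) ends the open entity.
            flush()
    flush()
    return spans
```

For example, the tokens `Alex is going to Los Angeles` tagged `B-PER O O O B-LOC I-LOC` yield one `PER` span ("Alex") and one `LOC` span ("Los Angeles").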

Twenkid commented 4 months ago

PHATGOOSE Repository

PHATGOOSE, which stands for Post-Hoc Adaptive Gating Over an Ocean of Specialized Experts, enables zero-shot generalization from specialized experts (e.g. PEFT modules) trained on diverse datasets by adaptively routing among them. It requires an additional, inexpensive training step: a gate is trained in front of each frozen PEFT module for its corresponding task.

https://github.com/r-three/phatgoose
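The routing idea can be illustrated with a deliberately simplified sketch. It is not the repository's implementation: it assumes each expert's trained gate is a single vector, that a token's activation is scored against every gate by a dot product, and that the top `k` experts are mixed with softmax weights. The function name `route_top_k`, the expert names, and the numbers are all made up for illustration.

```python
import math


def route_top_k(hidden, gate_vectors, k=2):
    """Pick the k experts whose gate vectors score highest for one activation.

    hidden: list of floats (one token's activation).
    gate_vectors: dict mapping expert name -> list of floats (assumed gate).
    Returns [(expert_name, weight), ...] with softmax weights over the chosen k.
    """
    # Score each expert by the dot product of the activation and its gate.
    scores = {
        name: sum(h * g for h, g in zip(hidden, gate))
        for name, gate in gate_vectors.items()
    }
    # Keep the k best-scoring experts, highest first.
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    # Softmax over the selected scores (shifted by the max for stability).
    best = max(scores[name] for name in top)
    exps = {name: math.exp(scores[name] - best) for name in top}
    total = sum(exps.values())
    return [(name, exps[name] / total) for name in top]


# Hypothetical gates for three experts and one 2-dimensional activation.
gates = {"qa": [2.0, 0.0], "nli": [0.0, 3.0], "summarize": [1.0, 1.0]}
routing = route_top_k([1.0, 0.0], gates, k=2)
```

Here the activation aligns best with the `qa` gate, so `qa` receives the largest weight and `summarize` the second; `nli` is not selected. For normalization and other details of the real method, see the repository linked above.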

Twenkid commented 4 months ago

Pyvene

Stanford NLP Python Library for Understanding and Improving PyTorch Models via Interventions

https://github.com/stanfordnlp/pyvene https://arxiv.org/abs/2403.07809

Twenkid commented 3 months ago

Depth maps, monocular depth estimation, synthetic data, real data ... Depth Anything V2

Authors: Lihe Yang (HKU), Bingyi Kang (TikTok, project lead), Zilong Huang (TikTok), Zhen Zhao, Xiaogang Xu, Jiashi Feng (TikTok), Hengshuang Zhao (HKU, corresponding author)

https://depth-anything-v2.github.io/

https://arxiv.org/html/2406.09414v1

Twenkid / Vsy-Jack-Of-All-Trades-AGI-Bulgarian-Internet-Archive-And-Search-Engine

Literature, References, Resources, Papers, Links, Links to Libraries etc. #15
