Open synctext opened 1 year ago
Over the last three weeks I redid the stochastic calculations for the chunking algorithm AE and its "cousin" RAM. The purpose of this analysis is to derive a formula relating their parameter $h$ to the expected mean chunk size $\mu$, which enables a comparative study across all algorithms. ~added 3-page math appendix A
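For intuition on the $h \rightarrow \mu$ relationship, a quick Monte Carlo estimate is easy to run. The sketch below is my own reconstruction of AE from its published description (not the authors' code): a cut-point is placed $h$ positions after the running maximum once no larger value appears in that window, so every chunk is at least $h+1$ long and $\mu$ can be measured empirically.

```python
import random

def ae_chunk_sizes(data, h):
    """Reconstruction of AE-style chunking: cut h positions after the
    running maximum, once no larger value has appeared in between."""
    sizes, start = [], 0
    max_pos = start
    i = start + 1
    while i < len(data):
        if data[i] > data[max_pos]:
            max_pos = i                  # new extremum, window restarts
        elif i == max_pos + h:
            sizes.append(i - start + 1)  # chunk ends at the cut-point
            start = i + 1                # next chunk begins after it
            max_pos = start
            i = start
        i += 1
    return sizes                         # trailing partial chunk ignored

random.seed(42)
stream = [random.random() for _ in range(200_000)]
h = 64
sizes = ae_chunk_sizes(stream, h)
mu = sum(sizes) / len(sizes)
```

Plotting `mu` against a range of `h` values gives an empirical curve to check any derived closed-form $\mu(h)$ against.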
We're approaching 40 pages now and are thinking with @grimadas about how to sell this work. One idea is to break it into two papers: a survey/SoK theoretical paper and an empirical-study paper. Possible target: JSys, 1 August.
Draft thesis title for forms: Decentralized Machine Learning Systems for Information Retrieval
< Placeholder > timeline: April 2023 - April 2027. ToDo: a 6-week hands-on Python onboarding project. Learn a lot and plan to throw it away. Next step: start working on your first scientific article. Four thesis chapters complete the PhD.
One idea: towards trustworthy and perfect metadata for search (e.g. perfect memory retrieval for the global brain #7064). Another idea: a gradient-descent-trained model takes any keyword search query as input and outputs a limited set of vectors; only valid content is recommended. The learning goal is semantic matching between the input query and the output vectors. General background; possible dataset sharing via the Spotify linkage.
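The query-to-vector idea above can be illustrated with a toy sketch: encode text into a fixed-size vector and return the top-k documents by cosine similarity. Everything here (the trigram-hashing encoder, the random projection, the names) is invented purely for illustration; a real system would learn the encoder by gradient descent rather than fix it randomly.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, BUCKETS = 32, 256
PROJ = rng.normal(size=(BUCKETS, DIM))  # fixed random projection (stand-in
                                        # for a learned embedding model)

def encode(text: str) -> np.ndarray:
    """Hash character trigrams into buckets, project to a dense
    unit-norm vector. Purely illustrative, not a trained encoder."""
    counts = np.zeros(BUCKETS)
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        counts[hash(padded[i:i + 3]) % BUCKETS] += 1
    vec = counts @ PROJ
    return vec / (np.linalg.norm(vec) + 1e-9)

def search(query, docs, k=2):
    """Return the k docs whose vectors best match the query vector."""
    doc_vecs = np.stack([encode(d) for d in docs])
    scores = doc_vecs @ encode(query)   # cosine similarity (unit norms)
    return [docs[i] for i in np.argsort(-scores)[:k]]
```

In the learned version, the "limited set of vectors" would come from the model, and only vectors corresponding to valid content would be indexed.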
Max. usage of expertise: product/market fit thinking
Background reading:
Pointwise approach: broadly speaking, each historic impression with a click is a positive training example, and each impression without a click is a negative training example.
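That pointwise recipe reduces ranking to plain binary classification. A minimal sketch with invented data and features (the feature names and synthetic click model are mine, just to show the mechanics):

```python
import numpy as np

# Toy impression log: clicked impressions get label 1, unclicked get 0.
rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 3))           # e.g. [bm25, recency, popularity]
true_w = np.array([2.0, -1.0, 0.5])   # hidden "true" click preferences
p_click = 1 / (1 + np.exp(-(X @ true_w)))
y = (rng.random(n) < p_click).astype(float)   # 1 = click, 0 = no click

# Pointwise LTR = logistic regression on (impression, click) pairs.
w = np.zeros(3)
for _ in range(500):
    pred = 1 / (1 + np.exp(-(X @ w)))
    w -= 0.1 * (X.T @ (pred - y)) / n         # gradient descent step

# At serving time, rank candidates by predicted click probability.
scores = X @ w
ranking = np.argsort(-scores)
```

Pairwise and listwise approaches differ only in the loss; the click-derived labels above are the pointwise-specific part.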
https://towardsdatascience.com/learning-to-rank-a-primer-40d2ff9960af

"We will implement a character-level sequence-to-sequence model, processing the input character-by-character and generating the output character-by-character. Another option would be a word-level model, which tends to be more common for machine translation."
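The character-level encoding such a model starts from is simple to sketch: strings become (time-step, vocabulary) one-hot tensors. The toy pairs and names below are mine; the tutorial uses real translation data.

```python
import numpy as np

# Invented toy parallel pairs, just to show the tensor shapes.
pairs = [("hi", "ha"), ("bye", "ba")]
chars = sorted({c for src, tgt in pairs for c in src + tgt})
idx = {c: i for i, c in enumerate(chars)}
max_len = max(len(s) for pair in pairs for s in pair)

def one_hot(text):
    """One row per time-step, one active column per character."""
    arr = np.zeros((max_len, len(chars)), dtype=np.float32)
    for t, c in enumerate(text):
        arr[t, idx[c]] = 1.0
    return arr

enc_in = np.stack([one_hot(src) for src, _ in pairs])   # encoder input
dec_out = np.stack([one_hot(tgt) for _, tgt in pairs])  # decoder target
```

A real setup additionally adds start/end tokens and a separate, shifted decoder-input tensor for teacher forcing, as the tutorial does.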
https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html

Venues:
Possible scientific storyline: SearchZero, a decentralised, self-supervised search engine with continuous learning.