Closed by synctext 2 weeks ago
Hello world
Hello world
Hello world, Ricardo here!
hello world, this is esmee, over
Hi, could you tell us the type/format of training data that we can get from the Superapp? Which features can we get, and in which format? Do we understand correctly that no preprocessing is done on user data yet? Thanks in advance!
Stuff we need to do/find out:
Tasks:
1) getting music info from the app @ehildebrand
2) finding a dataset
3) feature vectors from music and online svm @nata1y
4) gossiping models @JCBrouwer
5) predictions with model in interface @ehildebrand
6) like button in interface @ehildebrand
https://github.com/JCBrouwer/fedrecsys
Pros:
Cons:
A brief overview of current findings: We are going to create a separate Android library dedicated to federated ML. To train models in place, we are going to use the online learning interface from the Smile Kotlin library (https://haifengl.github.io/quickstart.html). We are going to implement the models from the paper: Pegasos SVM and Adaline. The concrete feature encoding/engineering technique is not clear yet; it depends on the type of data we can get from the user and will be discussed during the next meeting.
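For reference, a minimal from-scratch sketch of the online Pegasos update (it does not use the Smile API). It assumes a linear model without a bias term, labels of +1.0/-1.0, and omits the optional projection step; the class name `PegasosSvm` and the default `lambda` are made up for illustration.

```kotlin
/**
 * Minimal online Pegasos SVM (linear, no bias term).
 * Labels must be +1.0 or -1.0; `lambda` is the regularization strength.
 * Illustrative sketch only, not the Smile implementation.
 */
class PegasosSvm(private val dim: Int, private val lambda: Double = 0.01) {
    val weights = DoubleArray(dim)
    private var t = 0 // number of updates seen so far

    private fun dot(x: DoubleArray): Double {
        var s = 0.0
        for (i in 0 until dim) s += weights[i] * x[i]
        return s
    }

    /** One stochastic sub-gradient step on a single labelled example. */
    fun update(x: DoubleArray, y: Double) {
        require(x.size == dim) { "expected $dim features, got ${x.size}" }
        t += 1
        val eta = 1.0 / (lambda * t)          // Pegasos step size 1 / (lambda * t)
        val violatesMargin = y * dot(x) < 1.0 // evaluated with the weights before shrinking
        val shrink = 1.0 - eta * lambda       // equals 1 - 1/t
        for (i in 0 until dim) {
            weights[i] *= shrink
            if (violatesMargin) weights[i] += eta * y * x[i]
        }
    }

    /** Returns +1.0 or -1.0 depending on the side of the hyperplane. */
    fun predict(x: DoubleArray): Double = if (dot(x) >= 0.0) 1.0 else -1.0
}
```

Each call to `update` consumes one example, so a model like this can be trained in place on the device as new listening data arrives.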
These papers both propose more efficient gossiping by compressing parts of training intelligently. These can be useful to implement almost regardless of the exact recommendation models we choose to use. The second has a high-quality open-source implementation we can use as reference (+ the authors have a slack where they are reachable).
Once we have basic feature-based recommendations and a like button in the interface, we can think of bootstrapping some network-based collaborative filtering. This can be based on an assembly of info (music features, listening history, and likes) to find similar users so we can recommend based on similarity. The first paper and last 2 papers go over how this can be done in a privacy-preserving way.
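A rough sketch of what similarity-based recommendation could look like, assuming each user profile is a sparse map from song id to a score (e.g. 1.0 for a like). The function names are hypothetical, and this toy version compares plain profiles directly, so it does not address the privacy-preserving part discussed in the papers.

```kotlin
import kotlin.math.sqrt

/** Cosine similarity between two sparse profiles (song id -> score). */
fun cosineSimilarity(a: Map<String, Double>, b: Map<String, Double>): Double {
    // Iterate over the smaller profile to keep the dot product cheap.
    val (small, large) = if (a.size <= b.size) a to b else b to a
    var dot = 0.0
    for ((song, score) in small) dot += score * (large[song] ?: 0.0)
    val normA = sqrt(a.values.sumOf { it * it })
    val normB = sqrt(b.values.sumOf { it * it })
    return if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
}

/** Recommends up to `k` songs that similar peers scored and `me` has not seen yet. */
fun recommend(
    me: Map<String, Double>,
    peers: List<Map<String, Double>>,
    k: Int = 3
): List<String> {
    val scores = HashMap<String, Double>()
    for (peer in peers) {
        val sim = cosineSimilarity(me, peer)
        if (sim <= 0.0) continue // ignore peers with no overlap
        for ((song, score) in peer) {
            if (song !in me) scores[song] = (scores[song] ?: 0.0) + sim * score
        }
    }
    return scores.entries.sortedByDescending { it.value }.take(k).map { it.key }
}
```

Peers with no songs in common get a similarity of zero and contribute nothing, so recommendations only come from users with overlapping taste.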
Many approaches use a centralized parameter server to keep global updates. It seems plausible that a true central server could be replaced by a parameter server running locally on each device. These parameter servers could then share their weight updates via gossiping.
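A small sketch of how such a gossip step could look, assuming linear models with a fixed number of weights. The merge rule here (plain averaging, keeping the larger age) is one common choice and not necessarily what we end up using; `GossipModel` and `GossipNode` are hypothetical names, and networking is left out entirely.

```kotlin
/** A linear model plus an age counter (number of updates it has absorbed). */
class GossipModel(val weights: DoubleArray, val age: Int)

/** Merges two models by averaging their weights element-wise. */
fun merge(local: GossipModel, remote: GossipModel): GossipModel {
    require(local.weights.size == remote.weights.size) { "dimension mismatch" }
    val merged = DoubleArray(local.weights.size) { i ->
        (local.weights[i] + remote.weights[i]) / 2.0
    }
    return GossipModel(merged, maxOf(local.age, remote.age))
}

/**
 * One peer. `localUpdate` trains the given weights in place on local data
 * (for example a single online SVM step).
 */
class GossipNode(dim: Int, private val localUpdate: (DoubleArray) -> Unit) {
    var model = GossipModel(DoubleArray(dim), 0)
        private set

    /** Handles a model received from a peer and returns the model to forward next. */
    fun onReceive(remote: GossipModel): GossipModel {
        val merged = merge(model, remote)
        localUpdate(merged.weights) // raw user data never leaves the device
        model = GossipModel(merged.weights, merged.age + 1)
        return model
    }
}
```

Only weights travel between peers here; the local training data stays inside `localUpdate`.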
Possible gameplan for this entire project:
For next sprint: make progress on the above issues and report progress for next meeting.
EDIT: plus, this plug-in system for the "dApp" approach works fully, but usability is -100%: https://github.com/Tribler/trustchain-superapp/blob/master/freedomOfComputing/README.md
Bonus extension: integrate a compiler inside the superapp, to enable source code distribution and local compilation: https://play.google.com/store/apps/details?id=ru.iiec.jvdroid&hl=en_GB&gl=US (fewer security hazards, source code inspection)
EDIT2 (for future reference): Java compiler on Android: https://github.com/t-arn/java-ide-droid, heroic efforts for memory overflow, and https://play.google.com/store/apps/details?id=com.krazeapps.kotlinprogrammingcompiler
Two suggestions from @drew2a side
As we are "production software" now, I suggest doing all Superapp development in a separate branch to avoid interfering with the published code on the master branch.
To do this, we need to create a branch in https://github.com/Tribler/trustchain-superapp (e.g. feature/music-dao-recommendations). Then, all of our commits will go to this branch. At the end of the project, this branch will be merged to master.
If you are not familiar with git, take a look at https://www.gitkraken.com/
It will be tough to debug and test a modified application in the wild network. There are multiple ways we can make life easier for ourselves.
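I propose the following approach:
1. Deploy a dedicated bootstrap server (or a few). I can help with that.
2. Change (or override) a list of bootstrap servers (see an example [here](https://github.com/Tribler/kotlin-ipv8/commit/32a7286e20c254bd747be47c063fbcbdfe93d532))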
@drew2a we can seed this content for you: 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks from 16,341 artists and 14,854 albums, arranged in a hierarchical taxonomy of 161 genres and expand it with a crawl of 55,000 full audio tracks of Jamendo
Wow :)
> Use a separated branch
> As we are "production software" now, I suggest performing all the development of the Superapp in a separate branch to prevent interfering with a published code from the master branch.
Yes we're developing in branches on this fork. We can merge branches from there directly to the upstream master via pull request.
> I propose the following approach:
> 1. Deploy a dedicated bootstrap server (or a few). I can help with that.
> 2. Change (or override) a list of bootstrap servers (see an example [here](https://github.com/Tribler/kotlin-ipv8/commit/32a7286e20c254bd747be47c063fbcbdfe93d532))
Ok sweet, that sounds good.
> @drew2a we can seed this content for you: 917 GiB and 343 days of Creative Commons-licensed audio from 106,574 tracks from 16,341 artists and 14,854 albums, arranged in a hierarchical taxonomy of 161 genres and expand it with a crawl of 55,000 full audio tracks of Jamendo
Who sent you this? That definitely sounds like it can help our issues with amount of data!
Johan sent me this
Ahh I read over it, nice!
Andrei told me you can find the data storage here
Main updates: We are developing a federated ML service for the music DAO. We are currently facing some issues integrating it with IPv8 and would like to discuss them during the meeting. We have changed the ML model so that we no longer use a feature-based approach. However, because of the limit on the size of a model that we can send over the network, we probably do not want to store all possible song-song combinations in it (with the corresponding similarity value/weight). The idea is therefore to choose a predetermined set of songs (the most popular ones? just randomly chosen ones for now?). Is this an acceptable approach?
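To make the question concrete, a sketch of the "predetermined set of songs" idea, assuming the set is picked by play count and that only non-zero song-song weights are stored. `selectVocabulary` and `PairSimilarity` are hypothetical names, not code from our repo.

```kotlin
/** Picks the `n` most played songs (ties broken by id) as the fixed song set. */
fun selectVocabulary(playCounts: Map<String, Int>, n: Int): List<String> =
    playCounts.entries
        .sortedWith(compareByDescending<Map.Entry<String, Int>> { it.value }.thenBy { it.key })
        .take(n)
        .map { it.key }

/** Sparse, symmetric song-song similarity table restricted to a fixed song set. */
class PairSimilarity(vocabulary: List<String>) {
    private val index = vocabulary.withIndex().associate { (i, song) -> song to i }
    private val n = vocabulary.size
    private val weights = HashMap<Long, Double>() // only pairs that were set are stored

    // Order-independent key so that (a, b) and (b, a) share one entry.
    private fun key(a: Int, b: Int): Long {
        val lo = minOf(a, b)
        val hi = maxOf(a, b)
        return lo.toLong() * n + hi
    }

    /** Returns false (and stores nothing) if either song is outside the set. */
    fun set(a: String, b: String, w: Double): Boolean {
        val i = index[a] ?: return false
        val j = index[b] ?: return false
        weights[key(i, j)] = w
        return true
    }

    /** Returns 0.0 for unknown songs and for pairs that were never set. */
    fun get(a: String, b: String): Double {
        val i = index[a] ?: return 0.0
        val j = index[b] ?: return 0.0
        return weights[key(i, j)] ?: 0.0
    }

    val storedPairs: Int get() = weights.size
}
```

With a set of `n` songs the model can never hold more than `n * (n + 1) / 2` weights, which gives a hard upper bound on what has to be sent over the network.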
Discussed issues:
Current progress:
Refactoring the ML models to operate on sparse arrays instead of dense arrays
Tried to migrate trustchain to Kotlin 1.4.30; still experiencing some bugs.
Start with db integration
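For the sparse-array refactoring mentioned above, a minimal sketch of what a sparse vector could look like, assuming a hash map from index to non-zero value. `SparseVector` is an illustrative name; the actual classes in the repo may differ.

```kotlin
/** Sparse vector that stores only non-zero entries (index -> value). */
class SparseVector(val size: Int) {
    private val entries = HashMap<Int, Double>()

    operator fun get(i: Int): Double {
        require(i in 0 until size) { "index $i out of bounds" }
        return entries[i] ?: 0.0
    }

    operator fun set(i: Int, v: Double) {
        require(i in 0 until size) { "index $i out of bounds" }
        // Writing a zero removes the entry so the map never stores zeros.
        if (v == 0.0) entries.remove(i) else entries[i] = v
    }

    /** Dot product; iterates over the vector with fewer non-zero entries. */
    fun dot(other: SparseVector): Double {
        require(size == other.size) { "dimension mismatch" }
        val (small, large) =
            if (entries.size <= other.entries.size) this to other else other to this
        var s = 0.0
        for ((i, v) in small.entries) s += v * large[i]
        return s
    }

    val nonZeroCount: Int get() = entries.size
}
```

Memory and message size then scale with the number of non-zero weights rather than with the total number of songs.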
Questions: it seems that a song that has been listened to gets downloaded into some local directory. We could not find the location of this directory; where is it?
Here is the link to the repo: https://github.com/JCBrouwer/trustchain-superapp
The main updates are on the dev branch.
Discussion of progress, 1 team member present. Please focus on getting the code to compile and run as :1st_place_medal: priority. Then make stuff fancy. No need to repair or improve the superapp. The Smile library is huge (ML, NLP, etc.); please ignore it and do a low-performance, simple approach from scratch. The superapp might have issues with multiple Kotlin versions mixed.
Updates 15.03:
However:
Main updates:
We are planning to concentrate on testing everything with 2+ peers in the upcoming weeks. We would also like to add Bayesian learning from here to learn more accurate personal rankings.
@nata1y @JCBrouwer This is an example of working with kotlin-ipv8 outside of android: https://github.com/Tribler/trustchain-superapp/blob/master/musicdao-datafeeder/src/main/java/com/example/musicdao_datafeeder/DataFeeder.kt
./gradlew :musicdao-datafeeder:run --args="/home/user/torrents nopublish"
A minimum program might look like this:
// Imports are omitted here; see the DataFeeder.kt example linked above for the
// kotlin-ipv8 and SQLDelight classes used below.

// Music overlay backed by a TrustChain store in an in-memory SQLite database.
fun musicCommunity(): OverlayConfiguration<MusicCommunity> {
    val driver: SqlDriver = JdbcSqliteDriver(JdbcSqliteDriver.IN_MEMORY)
    Database.Schema.create(driver)
    return OverlayConfiguration(
        factory = MusicCommunity.Factory(
            settings = TrustChainSettings(),
            database = TrustChainSQLiteStore(Database(driver))
        ),
        walkers = listOf(RandomWalk.Factory())
    )
}

// Discovery overlay used to find and keep track of other peers.
fun discoveryCommunity() = OverlayConfiguration(
    factory = DiscoveryCommunity.Factory(),
    walkers = listOf(
        RandomWalk.Factory(timeout = 3.0, peers = 20),
        RandomChurn.Factory(),
        PeriodicSimilarity.Factory()
    )
)

// IPv8 instance listening on UDP port 8090 on all interfaces, without Bluetooth.
fun ipv8() = IPv8(
    endpoint = EndpointAggregator(
        udpEndpoint = UdpEndpoint(
            port = 8090,
            ip = InetAddress.getByName("0.0.0.0")
        ),
        bluetoothEndpoint = null
    ),
    configuration = IPv8Configuration(
        overlays = listOf(
            discoveryCommunity(),
            musicCommunity()
        ),
        walkerInterval = 1.0
    ),
    // A fresh key pair is generated on every run, so the peer identity is not persistent.
    myPeer = Peer(JavaCryptoProvider.generateKey())
)

fun main(args: Array<String>) {
    val ipv8 = ipv8()
    ipv8.start()
}
Raw .APK: https://github.com/JCBrouwer/trustchain-superapp/raw/federated-music-recommendation/gossipML/app-debug-gossipML.apk
To conclude:
Federated music recommendation has been merged :tada:
Master course on Blockchain Engineering project 2021
TEAM1: the first AI-dApp
Create a dApp with federated machine learning to understand the music taste of a user and recommend more. Exchange training vectors on the blockchain overlay. Never share with others which Bittorrent music swarms you like. By only sharing training data you can discover new music and still preserve privacy. Prior work and this.
General Description - self-evolving AI-DAO (https://github.com/Tribler/tribler/issues/5944)
Students from Delft have created a blockchain-based alternative to Spotify. Completely decentralised. It uses Bittorrent for streaming, Bitcoin for payments to artists, Trustchain with IPv8 for music discovery and IPv8 for app-to-app connectivity. With multiple teams the aim is to take this code to the next level: self-evolution.
A total of 4 student teams (4-5 students in each team) will work together on a cutting-edge scientific problem: how to create a software system which can be expanded in real-time and become increasingly 'intelligent'. Build upon the existing open source app by TUDelft on the Android Play store using blockchain technology: the Superapp. You will help transform essential parts of the music industry and replace them with open source software. Current code:
GIF: Browsing and streaming music with Bittorrent
GIF: Sending money to artists using Bitcoin
DAO - organisations in software
To make a "self-evolving" app we use the DAO concept. What is a DAO? Within the coming decades the future of jobs, employment and the nature of the firm will change profoundly. Automation, AI, and robots will replace many of today's jobs. A new type of company is a company without any employees, without any machines or physical infrastructure. A Decentralized Autonomous Organization (DAO) exists only in software. It goes beyond smart contracts; it is a complete company inside software. DAO development is still in the experimental stage. Background reading: a very optimistic view on DAO, and the official US review of DAO by the Securities and Exchange Commission.
Within this master course you can create your very own autonomous organisation, the AI-DAO. Learn to engineer a decentralised autonomous organisation, use the existing tools, and understand the security risks. The aim is to alter the nature of the firm in the Internet age, see the Nobel prize winning theory. Production costs become essentially zero. An organisation which exists purely in cyberspace. The AI-DAO is designed to be the first sustainable DAO. How can we empower leaderless organizations? How can it earn money from manipulating bits?
Scientific challenge: Self-evolving
A key step in an autonomous system is that it can evolve independently. This enables growth and evolution independently of any central organisation, sponsoring government, or tribe of volunteers.
You will collectively solve the problem of paying somebody to make new features in open systems which are fully decentralised. This goes further than paying somebody Bitcoins to create a new version. Decentralised technology is very robust to failures, manipulation, faults, and court cases. For instance, the Internet itself is almost impossible to shut down, and so is the "Tor darknet". With other teams you will address a key drawback of decentralised technology: it is difficult to update, nearly impossible to evolve, and lacks incentives to develop new features.
dApp ecosystem
"Distributed Applications" are a distributed way of running code. You will help develop an ecosystem of "global code". Code runs atop a blockchain and peer-to-peer (P2P) network that acts as a kind of operating system. This provides security, resilience, privacy, and novel features. This is related to smart contracts, but has no slow single virtual machine (all discussed in the online classes material). Background material: read FBASE trustworthy code execution
PNG: difference between cloud and decentralised Apps