Open synctext opened 7 years ago
[Copied from ticket #3670, moving activity here; oldest ticket] Master thesis idea:
info_hash
as a single requirement to identify, download, install, and run code. Possible sprints:
Security is a key concern. In the first sprints I would recommend focusing on the core features first and exploring security after that. There are sandboxing options like RestrictedPython, chrootbuilder, etc. A "Sandboxed Python" would let you permit or forbid modules, limit execution time slices, permit or deny network traffic, constrain filesystem access to a particular directory (presented to the code as "/"), and so on.
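As an illustration of the module allow-listing idea, here is a minimal sketch of a guarded `__import__` hook. This is not RestrictedPython's actual API, `ALLOWED_MODULES` and `run_sandboxed` are hypothetical names, and an `exec`-based guard like this is not a real sandbox (it can be escaped); it only shows the permit/forbid-modules concept:

```python
import builtins

ALLOWED_MODULES = {"math", "json"}  # hypothetical allow-list for a dApp

_real_import = builtins.__import__

def guarded_import(name, *args, **kwargs):
    # Permit only explicitly allow-listed top-level modules.
    if name.split(".")[0] not in ALLOWED_MODULES:
        raise ImportError("module %r is not allowed in this sandbox" % name)
    return _real_import(name, *args, **kwargs)

def run_sandboxed(source):
    # Execute untrusted source with the guarded import in place of the real one.
    env = {"__builtins__": {"__import__": guarded_import, "print": print}}
    exec(source, env)

run_sandboxed("import math\nprint(math.sqrt(16))")   # allowed module: runs fine
try:
    run_sandboxed("import socket")                   # network module: forbidden
except ImportError as e:
    print(e)
```

A production approach would additionally need resource limits and filesystem/network isolation, which is why chroot- and container-based approaches come up below.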
Fun part: the IPv8 execution engine is itself obviously an IPv8 distributed app, so it can upgrade itself.
Security threat observed in the wild: plugins (Docker images) which steal CPU cycles for mining: https://arstechnica.com/information-technology/2018/06/backdoored-images-downloaded-5-million-times-finally-removed-from-docker-hub/
Towards first running code prototype:
Goal: self-sustainable ecosystem. Dream outcome: Runs flawlessly and expands after thesis is completed :-)
Status update: experimented with IPv8, learning about related work.
Let's just go on another dreaming excursion...:
IPv8 dApps are the fundamental building blocks for the re-decentralisation of The Internet. They represent an evolution from monolithic code and REST-based micro-services towards trustworthy global computing. For over 20 years we have dreamed of composable software. We expand upon the idea of smart contracts by making them scalable and smarter. We never had software repositories which are server-free and maintenance-free. Now we have finally built the first autonomous isolated software pieces which understand the foundations of trust, can make decisions and learn, manage their own money, trade on marketplaces, can self-replicate, can do self-compilation, self-organise online elections, and have long-term reputational memory. Basic business dApps can be built for electronic signatures, legal document handling, product catalog publication, invoice processing, ordering, inventory finance, currency conversion, bank account management, micro-loans, and other basic banking functions.
Possible sprints to brainstorm about after Libtorrent basics are operational (our relentless incremental improvement methodology :-)
Even more ambitious is the offline-first approach, like Bramble, but they lack structural funding. Bramble builds upon Tor and handles all of the core functionality of "sending blocks of data back and forth, managing contacts, keeping channels between users secure and metadata-free, synchronizing state between a user's devices, and handling dependencies between pieces of data or expiring them when they get too old".
Got the first prototype to work. https://github.com/mitchellolsthoorn/ipv8-dapps-loader/
A torrent file is created from the 'execute.py' file, which contains a hello-world code example. This torrent is seeded on the local network by 'seeder.py'. The loader downloads this torrent and loads and runs the first dapp of this system.
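The "loads and runs" step can be sketched with `importlib`; the loader's actual mechanism may differ, and the `main` entry-point convention here is an assumption for illustration:

```python
import importlib.util
import pathlib
import tempfile

def load_and_run(dapp_dir):
    """Load a downloaded dApp's execute.py and call its entry point."""
    path = pathlib.Path(dapp_dir) / "execute.py"
    spec = importlib.util.spec_from_file_location("dapp_execute", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)       # runs the dApp's top-level code
    if hasattr(module, "main"):           # optional entry point, by convention
        module.main()
    return module

# Demo: fake a "downloaded" dApp directory with a hello-world execute.py.
with tempfile.TemporaryDirectory() as d:
    (pathlib.Path(d) / "execute.py").write_text(
        "def main():\n    print('hello dApp')\n")
    load_and_run(d)   # prints: hello dApp
```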
Solid progress! Relentless incremental improvement, next step: trust, community building and code-reviews?
Brainstorm outcome: we use the GitHub rating of software developers as a metric for the quality of a code review, just for initial bootstrapping. In later iterations we could hide the identity of the reviewers and have randomly picked witnesses validate the reputation of a software developer. Requirements for core functionality:
Brainstorming about the next upcoming sprint after skeleton voting is operational...
Related work, "Crev: dependency vetting with a web of trust" (Alpha stage), https://news.ycombinator.com/item?id=18824923
Progress update: Voting logic is implemented.
At the moment, nodes vote (+1) on dapps when they receive newly spread dapps. The dapps are gossiped to all nodes once. The gossiping happens through voting, where popular or good dapps are gossiped more than others.
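A minimal sketch of vote-weighted gossip selection; the names and the +1 smoothing (so brand-new dApps still have a chance to spread) are illustrative, not the actual implementation:

```python
import random

def pick_dapps_to_gossip(votes, k=2, rng=random):
    """Choose k dApps to forward, weighted by vote count, with +1 smoothing
    so zero-vote dApps can still be picked."""
    dapps = list(votes)
    weights = [votes[d] + 1 for d in dapps]
    return rng.choices(dapps, weights=weights, k=k)

votes = {"crawler": 9, "wallet": 3, "hello-world": 0}
print(pick_dapps_to_gossip(votes))   # e.g. ['crawler', 'crawler']
```

Popular dApps are forwarded more often, matching the "good dapps are gossiped more" behavior described above.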
I ran a successful experiment where all selected dapps were discovered by the nodes in the set-up.
Possible plans for the next sprint are:
Meeting minutes:
# Meeting-minutes fragment: wiring a DAppCommunity on top of a TrustChain
# testnet community (i is the node index in the experiment)
ipv8 = IPv8(configuration)
trustchain_peer = Peer(ECCrypto().generate_key(u"curve25519"))
trustchain_community = TrustChainTestnetCommunity(trustchain_peer, ipv8.endpoint, ipv8.network, working_directory='./' + str(i))
ipv8.overlays.append(trustchain_community)
ipv8.strategies.append((EdgeWalk(trustchain_community), 10))
dapp_community = DAppCommunity(trustchain_peer, ipv8.endpoint, ipv8.network, trustchain=trustchain_community)
runtime-import("Bittorrent-SHA1");
discovered-DApps();
ImportError and/or path error:
>python loader/runner.py
Traceback (most recent call last):
File "loader/runner.py", line 5, in <module>
from community.dapp.community import DAppCommunity
File "/home/pouwelse/GITHUB/ipv8-dapps-loader/loader/community/dapp/community.py", line 1, in <module>
from block import DAppBlock
File "/home/pouwelse/GITHUB/ipv8-dapps-loader/loader/community/dapp/block.py", line 1, in <module>
from pyipv8.ipv8.attestation.trustchain.block import TrustChainBlock
ImportError: No module named pyipv8.ipv8.attestation.trustchain.block
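This `ImportError` is typically a `sys.path` problem when running `loader/runner.py` directly. One common workaround, assuming the `pyipv8` checkout lives in the repository root, is a small path shim (illustrative, not the project's actual fix; running with `PYTHONPATH=.` from the repository root is an alternative):

```python
import os
import sys

def add_repo_root(script_path):
    """Insert the repository root (two directories up from loader/runner.py)
    at the front of sys.path so `pyipv8` resolves when the script is run
    directly rather than as a module."""
    root = os.path.dirname(os.path.dirname(os.path.abspath(script_path)))
    if root not in sys.path:
        sys.path.insert(0, root)
    return root

# e.g. at the top of loader/runner.py:
#     add_repo_root(__file__)
```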
Meeting minutes & progress:
Works!
2019-01-22T12:28:01+0100 [-]
.___ _____
__| _/ / _ \ ______ ______ ______
/ __ | / /_\ \ \____ \ \____ \/ ___/
/ /_/ / | \ |_> > |_> >___ \
\____ \____|__ / __/ | __/____ >
\/ \/ |__| |__| \/
2019-01-22T12:28:01+0100 [-] version 0.1
2019-01-22T12:28:01+0100 [-] [0] Show dApps
2019-01-22T12:28:01+0100 [-] [1] Create dApp
2019-01-22T12:28:01+0100 [-] [2] Exit
0
2019-01-22T12:28:03+0100 [-] Unhandled Error
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 103, in callWithLogger
return callWithContext({"system": lp}, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 86, in callWithContext
return context.call({ILogContext: newCtx}, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 122, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 85, in callWithContext
return func(*args,**kw)
--- <exception caught here> ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/posixbase.py", line 614, in _doReadOrWrite
why = selectable.doRead()
File "/usr/lib/python2.7/dist-packages/twisted/internet/process.py", line 291, in doRead
return fdesc.readFromFD(self.fd, self.dataReceived)
File "/usr/lib/python2.7/dist-packages/twisted/internet/fdesc.py", line 94, in readFromFD
callback(output)
File "/usr/lib/python2.7/dist-packages/twisted/internet/process.py", line 295, in dataReceived
self.proc.childDataReceived(self.name, data)
File "/usr/lib/python2.7/dist-packages/twisted/internet/_posixstdio.py", line 77, in childDataReceived
self.protocol.dataReceived(data)
File "/usr/lib/python2.7/dist-packages/twisted/protocols/basic.py", line 571, in dataReceived
why = self.lineReceived(line)
File "/home/pouwelse/GITHUB/ipv8-dapps-loader/twisted/plugins/cli_plugin.py", line 84, in lineReceived
self.menu_items[int(line)].values()[0]()
File "/home/pouwelse/GITHUB/ipv8-dapps-loader/twisted/plugins/cli_plugin.py", line 107, in show_dapps
msg(self._colorize("info_hash: " + dapp['info_hash'] + " name: " + dapp['name'], 'green'))
exceptions.TypeError: cannot concatenate 'str' and 'buffer' objects
The issues with the CLI interface and the menu were fixed. There were some conversion errors when printing values from the database, which caused the errors seen in the previous sprint update. The program was also logging many malformed-packet errors on network packets. This turned out to be caused by an increased block-size type that hadn't been updated in the TrustChain testnet. A PR with the fix was created, merged into IPv8, and uploaded to pip.
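In Python 2 terms, sqlite3 can return `buffer` objects for BLOB columns, which cannot be concatenated with `str` (the `TypeError` in the traceback above). A Python 3 analogue of the conversion fix; `render_dapp` is a hypothetical helper, not the project's code:

```python
def render_dapp(dapp):
    """Database columns may come back as bytes (Python 2: `buffer`);
    decode them before string formatting to avoid the concatenation error."""
    info_hash = dapp["info_hash"]
    if isinstance(info_hash, (bytes, bytearray)):
        info_hash = bytes(info_hash).hex()   # render binary hashes as hex
    return "info_hash: %s name: %s" % (info_hash, dapp["name"])

print(render_dapp({"info_hash": b"\xde\xad\xbe\xef", "name": "crawler"}))
# info_hash: deadbeef name: crawler
```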
The database was updated to store more structured data, so the program doesn't have to query the unstructured trustchain every time. This change also allowed for better software guards, such as checking whether the current dApp is known to the system before executing an action on it, and preventing double votes.
With this change, inconsistencies could arise between the trustchain and the database, either through manual tampering with one of them or through glitches in the block-receiving flow. To prevent this, verification checks run 5 seconds after start-up to detect and fix these inconsistencies.
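The reconciliation idea can be sketched as follows; the data shapes (`catalog` and `chain_votes` as info_hash-to-vote-count maps) are assumptions for illustration, not the actual schema:

```python
def reconcile(catalog, chain_votes):
    """Compare the local catalog's vote counts against counts recomputed
    from the TrustChain blocks, and repair any drift in the catalog."""
    fixed = {}
    for info_hash, true_count in chain_votes.items():
        if catalog.get(info_hash) != true_count:
            fixed[info_hash] = true_count
    catalog.update(fixed)
    return fixed   # the entries that were inconsistent

catalog = {"aaaa": 3, "bbbb": 1}
chain = {"aaaa": 3, "bbbb": 2}    # catalog drifted for dApp "bbbb"
print(reconcile(catalog, chain))  # {'bbbb': 2}
```

In the real system this would be scheduled once shortly after start-up, with the chain side recomputed from crawled blocks.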
The previous distribution mechanism of dApp loader only distributed dApps to connected peers when:
This strategy could lead to a dead network with passive users and a slow onset time for dApps. To help solve this problem, a dApp crawler was added that asks connected peers for unknown dApps, decreasing the onset time. Since the crawler only crawls directly connected peers, it doesn't significantly impact scalability.
The crawling task is executed asynchronously 10 seconds after start-up and every hour afterward.
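The project schedules this with Twisted; as a sketch, an asyncio analogue of "run once after a delay, then periodically" looks like this (delays are shortened in the demo so it finishes quickly):

```python
import asyncio

async def schedule_crawl(crawl, initial_delay=10, interval=3600):
    """Run `crawl` once after `initial_delay` seconds, then repeat every
    `interval` seconds, without blocking the rest of the node."""
    await asyncio.sleep(initial_delay)
    while True:
        crawl()
        await asyncio.sleep(interval)

# Demo with tiny delays:
hits = []

async def demo():
    task = asyncio.ensure_future(
        schedule_crawl(lambda: hits.append(1), initial_delay=0.01, interval=0.01))
    await asyncio.sleep(0.05)   # let a few crawl rounds happen
    task.cancel()

asyncio.run(demo())
# hits now holds one entry per crawl invocation
```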
To prevent malicious nodes from infecting the network with fake votes to promote certain dApps, a double-vote detection check has been added. Upon detection, a message is shown to the user of the system. This check runs on start-up but could be adjusted to run more often.
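Double-vote detection reduces to finding repeated (voter, dApp) pairs among the known vote blocks; the block shape below is illustrative:

```python
from collections import Counter

def find_double_votes(vote_blocks):
    """Return (voter, info_hash) pairs that appear more than once among
    the crawled vote blocks."""
    pairs = Counter((b["public_key"], b["info_hash"]) for b in vote_blocks)
    return [pair for pair, n in pairs.items() if n > 1]

blocks = [
    {"public_key": "peer1", "info_hash": "abcd"},
    {"public_key": "peer1", "info_hash": "abcd"},   # duplicate vote
    {"public_key": "peer2", "info_hash": "abcd"},
]
print(find_double_votes(blocks))   # [('peer1', 'abcd')]
```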
In the previous sprint the program consisted of multiple parts:
These parts have now all been integrated into the dApp community. This integration led to some problems with the (dynamic) package import, but from preliminary testing it all appears to work now. Context options were added to the menu to reflect this:
In the development of the execution engine, several problems were encountered:
Executable: This was the basic test approach, but could still be useful for certain dapps.
IPv8 overlay: This is the original idea of adding overlays to the same IPv8 instance that the dApp loader is using. This, however, causes problems with common services like TrustChain that need to be adapted for certain dapps. Port numbers would also be harder to manage if extra services, e.g. a REST manager, are started. This type is, however, needed/preferred when running multiple modularised application modules.
Twisted service: Twisted services could provide an alternative or secondary approach, where each service runs its own IPv8 instance and thus on its own port. This would eliminate the common-services problem and increase security, but also the overhead.
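The three options above can be framed as interchangeable strategies behind one interface; this class sketch is hypothetical, not the loader's actual design:

```python
from abc import ABC, abstractmethod

class ExecutionEngine(ABC):
    """Common interface for the three delivery models."""
    @abstractmethod
    def run(self, dapp_path):
        ...

class ExecutableEngine(ExecutionEngine):
    def run(self, dapp_path):
        # spawn the dApp as an ordinary subprocess
        return "subprocess: %s" % dapp_path

class OverlayEngine(ExecutionEngine):
    def run(self, dapp_path):
        # register the dApp as an overlay on the shared IPv8 instance
        return "overlay on shared IPv8: %s" % dapp_path

class ServiceEngine(ExecutionEngine):
    def run(self, dapp_path):
        # start a dedicated service with its own IPv8 instance and port
        return "dedicated service: %s" % dapp_path

print(OverlayEngine().run("dapps/crawler"))   # overlay on shared IPv8: dapps/crawler
```

Keeping the engines behind one interface would let the loader pick per dApp whether isolation (own instance/port) is worth the overhead.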
Success! :smile: Operational Trustchain crawler as a dApp. Meeting minutes: focus on either the market dApp, the trust dApp, or implementing a dynamic GUI. Best to focus on the toughest problem, with roadmap consequences: the GUI work.
Sprint goal: in 2 weeks make a prototype of the dynamic GUI. Prior notes: "cross-platform with Android support, possibly Android WebView (default GUI, with IPv8 services in background; dynamic GUI update)". Prior work on QR-code scanning and reading in Android. Great basic dApp dependency. Test for Android: Trustchain crawler GUI.
Create demo .apk (ipv8-service and IPv8 app)
Future sprints: market dApp, Bitcoin wallet, QR-codes + IBANs. (ring-fenced local and global REST api)
2019-02-04T14:26:12+0100 [-] version 0.1
2019-02-04 14:26:12,698:DEBUG:dApp-community: Getting all dApps from catalog
2019-02-04 14:26:12,698:DEBUG:persistence: Getting all dApps from catalog
2019-02-04T14:26:12+0100 [-] 0 dApps found:
2019-02-04T14:26:12+0100 [-] [-1] Return to previous menu
0
2019-02-04T14:26:15+0100 [-] Unhandled Error
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 103, in callWithLogger
return callWithContext({"system": lp}, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/log.py", line 86, in callWithContext
return context.call({ILogContext: newCtx}, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 122, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 85, in callWithContext
return func(*args,**kw)
--- <exception caught here> ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/posixbase.py", line 614, in _doReadOrWrite
why = selectable.doRead()
File "/usr/lib/python2.7/dist-packages/twisted/internet/process.py", line 291, in doRead
return fdesc.readFromFD(self.fd, self.dataReceived)
File "/usr/lib/python2.7/dist-packages/twisted/internet/fdesc.py", line 94, in readFromFD
callback(output)
File "/usr/lib/python2.7/dist-packages/twisted/internet/process.py", line 295, in dataReceived
self.proc.childDataReceived(self.name, data)
File "/usr/lib/python2.7/dist-packages/twisted/internet/_posixstdio.py", line 77, in childDataReceived
self.protocol.dataReceived(data)
File "/usr/lib/python2.7/dist-packages/twisted/protocols/basic.py", line 571, in dataReceived
why = self.lineReceived(line)
File "/home/pouwelse/GITHUB/ipv8-dapps-loader/twisted/plugins/cli_plugin.py", line 134, in lineReceived
self.dapp_menu_items[int(line)].values()[0](line)
File "/home/pouwelse/GITHUB/ipv8-dapps-loader/twisted/plugins/cli_plugin.py", line 224, in download_dapp
self.dapp_community.download_dapp(self.context['info_hash'])
exceptions.KeyError: 'info_hash'
This sprint I ran into a couple of hurdles that slowed down my progress.
My main focus was on the following tasks:
The hurdles I ran into were:
Completed tasks:
Task still working on:
REST API:
Basic Website:
Meeting minutes:
Todays meeting minutes:
Storyline input: This thesis presents a framework and implementations of a truly universal binary.
This sprint I focused on:
Due to computer networking lab preparation, I had less time than I would have wanted. I really wanted to get a good start on writing the thesis, but this didn't work out in the end. I also had a hard time figuring out how to put the actual problem and subject into writing. I think this is an important discussion point, because getting a clearer picture of this would help me in writing the thesis.
Things I would like to work on in the next sprint:
Android app:
Meeting notes: proposal for the problem description. The risk is that it comes across as merely a plug-in system. Make it dramatic, like: "this thesis presents a multi-platform binary and a new paradigm for trustworthy computing. We build on the large body of work around smart contracts and make it more generic and scalable, removing global consensus and the need for oracles." etc. Compare Debian, NodeJS, Winamp plugins, Facebook plugins, and Dapps. Argue that this thesis represents the next step in the continued evolution of computing models.
Next steps:
This sprint was not very productive. I wanted to get a lot more done, but I had to spend all my time on finishing the lab assignments for the computer networking course, since otherwise the lab would be in big trouble. I am now done with my part and am transferring the remaining tasks to other people, so I can focus more on my thesis.
I did work on a very rough beginning of the first chapter of the thesis, and I am continuing with that for the next week before I go to the conference.
Draft of outline for the story line: Master_Thesis.pdf
Nice read, three phases of reusable code
Next sprint suggestions:
small history of software reusability + user-extensible software, first Chapter 1 text draft
Possible thesis final epic experiment:
Related work around running untrusted code in kernel space.
Berkeley Packet Filter (BPF) has turned into a broad Linux kernel mechanism for untrusted code. eBPF can run user-supplied programs inside the Linux kernel, even at ring 0. @qstokkink this might be inspiration for generic "IPv8 communities" and may unlock smarter smart contracts by finally providing parallelism. It is especially designed for high-bandwidth workloads. Details:
Related work that also has our features: https://www.kickscondor.com/on-dat/
"It’s clear that there are tremendous advantages here: Dat is apps without death. Because there is no server, it is simple to both seed an app (keep it going) and to copy it (re-centralize it). .. In fact, it now becomes HARD:IMPOSSIBLE to take down an app. There is no app store to shut things down. There is no central app to target. In minutes, it can be renamed, rehashed, reminified even (if needed)—reborn on the network."
Related work, mention in .tex thesis. Live code injection in Python: https://pyrasite.readthedocs.io/en/latest/GUI.html Extensive tooling.
Plugin malicious code attack: https://harry.garrood.me/blog/malicious-code-in-purescript-npm-installer/
Thesis progress meeting: state of the code, discuss the defining thesis experiment & chapter. Outcome of the brainstorm: it is possible to create several trust functions. Dataset and existing code for putting in IPv8 dApps:
Current features include a crawler: ask neighbors, via active gossip, for all known dApps. Focus for next sprint:
2019-08-16T14:42:03+0200 [-] Unhandled Error
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/twisted/application/app.py", line 396, in startReactor
self.config, oldstdout, oldstderr, self.profiler, reactor)
File "/usr/lib/python2.7/dist-packages/twisted/application/app.py", line 311, in runReactorWithLogging
reactor.run()
File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1243, in run
self.mainLoop()
File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1252, in mainLoop
self.runUntilCurrent()
--- <exception caught here> ---
File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 878, in runUntilCurrent
call.func(*call.args, **call.kw)
File "/home/pouwelse/GITHUB/ipv8-dapps-loader/loader/community/dapp/community.py", line 388, in _check_votes_in_catalog
creator = tx_dict[DAPP_BLOCK_TYPE_VOTE_KEY_CREATOR] # type: bytes
exceptions.KeyError: 'creator'
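A defensive fix for this class of error is to tolerate vote blocks that predate or omit the `creator` field instead of crashing the whole catalog check; `get_creator` is a hypothetical helper, not the project's code:

```python
def get_creator(tx_dict, key="creator"):
    """Return the vote's creator, or None for old/malformed blocks so the
    caller can skip them instead of raising KeyError."""
    creator = tx_dict.get(key)
    if creator is None:
        return None   # block lacks the field: caller should skip it
    return creator

print(get_creator({"creator": "peer1"}))   # peer1
print(get_creator({}))                     # None
```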
Related UMD work: This repository formalizes the design and implementation of the Universal Module Definition (UMD) API for JavaScript modules. These are modules which are capable of working everywhere, be it in the client, on the server or elsewhere.
Some progress was made, but not the amount that I was aiming for. This is mainly due to underestimating the time it took to prepare for and actually move to Delft, and to me becoming ill.
A proof of concept was made to see how modules could be used to interchange logic that changes the output behavior of an application. I ran into some issues getting the two algorithms that we chose to run, since the code was very old, outdated, and undocumented. Eventually, I managed to update all the outdated functions and figure out how to get it running.
I haven't managed to make any progress on the report side of the project.
To make the process more structured and focused I have created the following planning:
Week 37: Get better, wrap up some last moving items, expand the trust-function modules proof-of-concept
Week 38: Draft chapter Problem description and related work
Week 39: Expand on the code-reviewability aspect of the project
Week 40: Draft chapter Android experiment and possibly trust experiment
Week 41: Draft chapter module and framework
Thesis outline discussion:
Worked on thesis the last week. Added draft of the introduction and the problem description. Had a hard time trying to separate the different sections: Introduction, Problem description, and related work.
Some personal notes:
Progress can be found in: https://github.com/mitchellolsthoorn/master-thesis
Scientific thesis title (less engineering focus): "Implementation of a collective-intelligence coding framework"
Suggestion for problem-description related work: some theoretical grounding for coding frameworks exists in the realm of "enterprise software". Evolvability is the cardinal concept in "Normalized Systems Theory", which states that the competitive environment of organizations changes rapidly and that they must therefore adapt in an agile way. This leads to software development tools with versioning in data, APIs, and data processing, and to systems with minimal dependencies. We reproduce the following four principles from [NST] here:
– Separation of Concerns, stating that a processing function can only contain a single concern (i.e., change driver or each part which can independently change);
– Data Version Transparency, stating that a data structure that is passed through the interface of a processing function needs to exhibit version transparency (i.e., not impacting processing functions in case it is updated);
– Action Version Transparency, stating that a processing function that is called by another processing function needs to exhibit version transparency (i.e., not impacting its calling processing functions in case it is updated);
– Separation of States, stating that the calling of a processing function within another processing function should exhibit state keeping (i.e., before calling another processing function, a state should be kept).
Back from the conference in the States.
The remaining days I continued working on writing the thesis. I still have to resolve points from the previous meeting notes, but I wanted to focus first on getting more concepts on paper, so it is easier to see the bigger picture and the storyline. Afterward, I will start refactoring to make parts more scientific and less storytelling.
I have worked on creating diagrams for different parts of the application, so they can be more easily explained. Especially activity diagrams would help with explaining the process for some of the decentral operations. I have started writing the Design and Experiment/Evaluation chapter and the section about the Android app experiment.
In the coming two weeks I want to focus on finishing the code and writing tests.
Suggestion new title:
The Lehman law matches the topic very well. I can definitely use that. Thank you.
The other resource you mentioned, about Normalized Systems Theory, seems to describe the development of highly abstracted and engineered software applications that allow for agile development without breaking and with minimal change. I don't know if this is very suitable.
please mention: live code updates: https://github.com/julvo/reloading
Need one more week to finish the final product and experiment. After that focus completely on adding remaining concepts and rewriting and refining the report.
Had trouble finding good metrics for the experiment that aren't very subjective or dependent on the composition of the network. Technical metrics are not best suited for comparison with other platforms.
Design chapter structure/notes:
Quote @mitchellolsthoorn : Normalized Systems Theory == purist, holy grail and therefore inefficiency beyond usage. Too complex. Without simplicity all agility is lost.
Python does not need to be motivated. All engineering details are merely documented, not meticulously defended. "We use Trustchain as the distributed ledger in our experimental implementation."
IPv8 runtime model versus Ethereum fundamental brokenness (e.g. scalability and cost). The Ethereum model is not long-term sustainable:
Back in October '17, an investor sent 1,700 ETH to a contract (AirSwapDEX) with a gas price of 400,000 Gwei and a gas limit of 592,379. The Tx failed for some odd reason, but the investor was charged a whopping 236 ETH ($122,086 as per today's price).
"Trustchain does not have a native cybercurrency such as Ethereum, it provides a transaction recording ledger". Plus distributed trackers are currently still a central dependency (e.g. 25MByte bootstrap is still ongoing work).
Your thesis work requires a name for ease of referencing. Pasta-Frame: the next evolution of modularised code execution. Its key property is permissionless code execution at near-zero cost. The code-execution architecture defines the maximum complexity of the code that can be produced. The JVM, NodeJS, CPAN Perl modules, and other real-world frameworks, together with the connections between modules, determine the Maximum Complexity of Applications (MCA) which a single company, a global consortium, or an open-source community can create. We devised the first architecture that takes the MCA as the cardinal design optimisation. Science questions: what defines the MCA? What constrains or boosts it? The interconnection fabric is the key determinant for the MCA: how data flows between modules, how future-proofing is arranged, how any piece of code can interconnect with any other code, and how we can devise the universal module interconnector.
Feel free to cite this 2003 work on dynamic configuration and dynamic runtime loading of code modules with a dependency graph with running code:
General remark: please always keep the science first and omit engineering details. Runs!
"Dependency managers now exist for essentially every programming language: Maven Central (Java), NuGet (.NET), Packagist (PHP), PyPI (Python), and RubyGems (Ruby) each host more than 100,000 packages. The arrival of this kind of fine-grained, widespread software reuse is one of the most consequential shifts in software development over the past two decades. And if we're not more careful, it will lead to serious problems." https://queue.acm.org/detail.cfm?id=3344149
I continued working on the report, although the writing isn't going as fast as I want. I struggled a lot during the last sprint because I am not making the progress I want to. I have rewritten the structure of the Introduction and moved parts of it around. I want to discuss whether this new structure is a better fit for the thesis:
Introduction:
report (25).pdf review of current thesis:
ToDo: animated .GIF of whole trustchain .APK starting, loading, module fetching, trustchain starting, crawling, and browsing ? Use as leading example in thesis experiment? Merge Chapter 5 in other chapter?
LATEST raw .pdf of master thesis
2.4. Runtime Engine
Phrase this more as a problem description instead of a solution direction.
Like: "A key bottleneck for re-usability and usability is the lack of support within the execution environment. Requirements for our envisioned runtime engine are as follows...". Requirement: "runtime support"? Comments:
Goal: Distributed apps with efficient execution model and event-processing.
Possible operation scenario: you download some blockchain "genesis code" to connect to the blockchain. You now can download the basic data and complete the bootstrap. This basic data would somehow be trustworthy and executable. Trust and abuse prevention is the key challenge.
this work fits into our long line of work around this theme.
Plugins and apps have been around for decades. In 1999 I created one within Winamp. Numerous app ecosystems emerged; in 2008 a vulnerability was published in Facebook Apps. Dealing with vulnerabilities, spam, and malware has proven to be a hard problem. Tezos aims to have a self-describing blockchain, but has not yet addressed any of the fundamental security issues.
In 2008 the Tribler research team reached a key milestone: an app store and execution platform without any server. See our research report on this fully self-organising system; this work has a distant relation to the Ethereum smart contracts which launched years later. Real 2008 widget code ran without any servers: no app store and no app-platform servers. In 2015 we created self-compiling Android apps ("Autonomous smartphone apps: self-compilation, mutation, and viral spreading").
Estimated repository size of various widget platforms in 2008
The 2008 widget source code was simply gossiped around between clients. Each client collected raw .py files and displayed a local app store interface like:![image](https://cloud.githubusercontent.com/assets/325224/26027775/5c0987e2-3814-11e7-8706-98baa346b76e.png)
Our 2008 architecture for serverless widgets:![image](https://cloud.githubusercontent.com/assets/325224/26027772/2b21791e-3814-11e7-91b7-e26c692922e8.png)
full 2009 thesis