solomonxie commented 5 years ago

Here are some notes I made when the idea of cloning Spotify and making it customizable crossed my mind.

TODO

[x] Design Database & ORM
[ ] Quick implementation: ORM Data Insertion
- [x] Spotify
  - [x] Accounts
  - [x] Tracks
  - [x] Albums
  - [x] Artists
  - [x] Playlists
- [x] MusicBrainz
  - [x] Recordings
  - [x] Releases
  - [x] Artists
- [ ] Filesystem
  - [ ] MP3 file
  - [ ] Folder structure
[x] Quick implementation: Data Query
- [x] Accounts
- [x] Tracks
- [x] Albums
- [x] Artists
- [x] Playlists
[x] Quick implementation: Web API
- [x] Postman Test
- [x] Spotify
  - [x] Authentication
  - [x] Retrieving
  - [x] Paging
- [x] MusicBrainz
  - [x] Authentication
  - [x] Retrieving
[x] Quick implementation: Flask
[ ] Quick implementation: Tagging (Auto-tagger)
- [ ] Spotify -> MusicBrainz
  - [x] Track
  - [ ] Album
  - [ ] Artist
- [ ] Local files -> MusicBrainz
[ ] Design Code Architecture
[ ] Modularization
[ ] Implement Models & Unit tests
[ ] Create Frontend Pages
[ ] Integrate Frontend & Backend
[ ] Integration Test
[ ] Deploy on Server
[ ] Documentation
[ ] Publish beta

solomonxie commented 5 years ago

PROJECT DESIGN


REQUIREMENTS ANALYSIS

Music Player
- Display music library [playlists, artists, albums, songs, recommendations]
- Online stream music playing
Personal Data Management
- Backup [playlists, liked songs, albums, artists]
  - Save to database [sqlite]
  - Export to local files [m3u, csv]
  - Import from local data [m3u, csv, sqlite]
- Apply changes to Spotify
Media File Organizing
- Auto Tagging: treat Spotify's metadata as authoritative and other sources as supplementary.
  - By folder structure
  - By filename
  - By existing tags
  - By fingerprint
- File renaming
  - Rename file
  - Re-structure folder
Media File Sources
- Import From Local Files
  - Only import recognizable files
  - Group un-recognized files into one folder to be further processed
- Refer to Cloud drives: [Webdav, Google drive, AWS-S3]
- Refer to Spotify 30s preview file: "Spotify API"
- Refer to Youtube [albums, songs]: "Track Connectors"
- Refer to Search engines: "Piratebay"
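
The "Export to local files [m3u, csv]" requirement above could be sketched roughly as follows. This is only an illustration: the track dictionaries and their keys (`title`, `artist`, `duration`, `path`) are assumptions, not the project's actual data model.

```python
import csv
import io

def export_m3u(tracks):
    """Render tracks as an extended M3U playlist string."""
    lines = ["#EXTM3U"]
    for t in tracks:
        # Extended M3U entry: "#EXTINF:<seconds>,<display title>" then the path
        lines.append(f"#EXTINF:{t['duration']},{t['artist']} - {t['title']}")
        lines.append(t["path"])
    return "\n".join(lines) + "\n"

def export_csv(tracks):
    """Render the same tracks as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=["title", "artist", "duration", "path"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(tracks)
    return buf.getvalue()

# Hypothetical sample data
tracks = [
    {"title": "Song A", "artist": "Artist X", "duration": 215,
     "path": "music/song_a.mp3"},
]
```

Both functions return strings, so the caller decides where the backup is written (local disk, Webdav, and so on).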

PROTOTYPE & Frontend

No frontend design is needed; it is supposed to be a complete copy of Spotify's.

Tech Stacks

Hosting: AWS Lightsail Ubuntu 16.04
Language: Python + JS
HTTP Server: Nginx + WSGI
Application Framework: Django
Databases: Postgresql + Redis + Sqlite
DevOps: Bash scripts, Git, Travis CI, Docker, Ansible
Utilities: Github Issues, Axure prototype
Editors: Vim, Sublime Text, Visual Studio Code
Business tools: Google Ads

https://stackshare.io/solomonxie/spoilfy

Modularization (by Class)

Root:

[ ] Providers
- [ ] Spotify
  - [x] ORM
  - [ ] API
  - [ ] Manage
- [ ] MusicBrainz
  - [ ] ORM
  - [ ] API
  - [ ] Manage
[ ] Template
- [ ] Tracks
- [ ] Albums
- [ ] Artists
- [ ] Playlists
[ ] Route.py

Modularization (by Page)

[ ] Music Library (Spotify)
- [ ] Track: [Display, Play, Add, Delete, Sync, Import, Export]
- [ ] Albums: [Display, Play, Add, Delete, Sync, Export]
- [ ] Artists: [Display, Play, Add, Delete, Sync, Export]
- [ ] Playlists: [Display, Play, Add, Delete, Update, Import, Export, Sync]
- [ ] Recent Played: [Display, Play]
- [ ] Home: [Display, Play]
- [ ] Browse: [Display, Play]
- [ ] Recommends: [Display, Play]
[ ] Dashboard (Management)
- [ ] Sync Library
  - [ ] From Spotify: [API Fetching]
  - [ ] From Xiami
  - [ ] From iTunes
- [ ] Local File Importation: [Tagging, Relating]
- [ ] Cloud Drive Importation
  - [ ] Webdav: [Tagging, Relating]
  - [ ] Google Drive: [Tagging, Relating]
  - [ ] Dropbox: [Tagging, Relating]

Interfaces Definitions

Instead of passing data directly from the database to Jinja2 templates, it's more flexible to expose the data through a RESTful API.

When the frontend requests some data, such as an album, the program first checks whether it already exists in the database on the server. If not, it requests it from the Spotify API, the MusicBrainz API and other APIs, then both saves the newly retrieved data and presents it to the user.

RESTful API Design

This API only returns data from the CURRENT user's library. Data is returned in JSON format.

BASE URL: http://music.solomonxie.top/spoilfy/api/v1

Entry points:

/tracks: Return a list of user's Liked Tracks.
/artists: Return a list of user's Followed Artists.
/albums: Return a list of user's Saved Albums.
/playlist: Return a list of user's Saved Playlists.
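
As a rough sketch of how these entry points might be served, here is a minimal WSGI callable using only the standard library (the notes mention Nginx + WSGI; a Flask or Django view would look different). The `LIBRARY` dict, the `{"items": ...}` response shape and the 404 payload are assumptions for illustration; only the path prefix and the four entry point names come from the notes.

```python
import json

# Hypothetical in-memory stand-in for the current user's library.
LIBRARY = {
    "tracks": [{"uri": "spotify:track:1", "name": "Song A"}],
    "artists": [],
    "albums": [],
    "playlist": [],
}
PREFIX = "/spoilfy/api/v1/"

def app(environ, start_response):
    """Minimal WSGI callable serving the four read-only entry points."""
    path = environ.get("PATH_INFO", "")
    key = path[len(PREFIX):] if path.startswith(PREFIX) else None
    if environ.get("REQUEST_METHOD") == "GET" and key in LIBRARY:
        status, payload = "200 OK", {"items": LIBRARY[key]}
    else:
        status, payload = "404 Not Found", {"error": "unknown entry point"}
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

For local experiments the callable can be passed to `wsgiref.simple_server.make_server("", 8000, app)`.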

Backend work flow

Init/Update User Library Flow:

@1 Fetch User full data from Spotify API ->
@2 Process retrieved data ->
@3 Save to Local Spotify Library ->
@4 Cross searching MusicBrainz Library ->
@5 Save to User Library.
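
The five steps above might be wired together roughly like this. It is a sketch under assumptions: `fetch_spotify` and `match_musicbrainz` are hypothetical callables standing in for the real Web API clients, and plain dicts stand in for the database tables.

```python
def sync_user_library(fetch_spotify, match_musicbrainz, spotify_lib, user_lib):
    """Run steps 1-5 for one user and return the number of saved items."""
    saved = 0
    for raw in fetch_spotify():                       # step 1: fetch from Spotify
        item = {                                      # step 2: process raw data
            "uri": f"spotify:track:{raw['id']}",
            "name": raw["name"].strip(),
        }
        spotify_lib[item["uri"]] = item               # step 3: local Spotify library
        mbid = match_musicbrainz(item)                # step 4: cross search, may be None
        user_lib[item["uri"]] = {"musicbrainz_id": mbid}  # step 5: user library
        saved += 1
    return saved

# Usage with stubbed providers (hypothetical data)
spotify_lib, user_lib = {}, {}
count = sync_user_library(
    lambda: [{"id": "abc", "name": " Song A "}],
    lambda item: None,
    spotify_lib,
    user_lib,
)
```

Passing the providers in as callables keeps the flow testable without network access.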

User Library API:

Client sends request ->
@1 API Entry point ->
@2 Get LIST from User's library ->
@3 Check information from Local Public Library for each item in the LIST ->
@4 If not exists, RETRIEVE each item's data from Provider's Web API ->
@5 Process retrieved data ->
@6 Save to Local Public Library ->
@7 Return to client.
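
Steps 2 to 7 amount to a read-through cache in front of the providers. A minimal sketch, assuming a dict as the Local Public Library and a hypothetical `fetch_from_provider` callable in place of the real Web API client:

```python
def get_library_items(user_uris, public_lib, fetch_from_provider):
    """Return full records for a user's list, filling gaps from the provider."""
    results = []
    for uri in user_uris:                      # step 2: the user's LIST
        record = public_lib.get(uri)           # step 3: check local public library
        if record is None:
            record = fetch_from_provider(uri)  # step 4: retrieve from the Web API
            public_lib[uri] = record           # steps 5-6: process and save
        results.append(record)
    return results                             # step 7: return to client
```

Items already stored locally never trigger a provider request, so the public library only grows when something new is asked for.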

User flow:

Import library ->
Music Play or Manage

Tagging flow:

Import files ->
Tagging ->
Relate to MusicBrainz ->
Relate to Spotify
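
The "Tagging" step could start with something as simple as guessing tags from the filename before querying MusicBrainz. The sketch below assumes a naming convention of `[NN ]Artist - Title.ext`, which is purely an assumption; files that do not match are returned as `None` so they can be grouped for further processing.

```python
import re
from pathlib import PurePosixPath

# Assumed naming convention: "[NN ]Artist - Title.ext"
PATTERN = re.compile(
    r"^(?:(?P<track>\d{1,3})[\s._-]+)?(?P<artist>.+?)\s+-\s+(?P<title>.+)$"
)

def guess_tags(path):
    """Guess track number, artist and title from a file path, or return None."""
    stem = PurePosixPath(path).stem
    match = PATTERN.match(stem)
    if match is None:
        return None  # unrecognized: leave for manual processing
    # Drop groups that did not participate (e.g. no track number)
    return {k: v for k, v in match.groupdict().items() if v is not None}
```

The guessed artist and title would then serve as search terms for the "Relate to MusicBrainz" step.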

solomonxie commented 5 years ago

Project: Spotify Localized @Oct 3, 2018

Spotify really annoys me with its connection issues over VPN. Well, that's not really Spotify's fault. Not having a Spotify premium membership annoys me too. Well again, not really Spotify's fault.

All of this got me thinking, "What if I could just have a music player that plays local mp3 files and performs as well as Spotify?"

Well, it can be in the cloud as well. I can store music files either locally on an iPhone or on a remote cloud server. That's not a big deal.

The real question is how to produce good SUGGESTIONS from the limited set of songs I have, say about 2000 songs (10GB).

Actually, Spotify probably wouldn't play 2000 different songs for me anyway; it just has enough information about songs to suggest good ones.

So if I train my SUGGESTION SYSTEM well, I don't need many songs to still get the most out of it.

solomonxie commented 5 years ago

Project: `Free-Spotify` or `Spoilfy` @Nov 8, 2018

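Build a Spotify-like online music site on my own server: a server along the lines of Emby, but more closely aligned with Spotify. It is essentially a private cloud that is not open to the public, so copyright issues are largely not involved.

Similar products for reference: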

Emby
OwnCloud
NextCloud
Seafile

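Use the Spotify API to read all of the user's information, back it up to my own server, and generate an interface identical to Spotify's. This provides backup on one side and online listening on the other.

The problem is where the songs come from. Users need to download them themselves and upload them to the server. Free-Spotify then identifies the songs automatically (via MusicBrainz) and makes them playable, with no need to match them by hand one at a time. That way, even when Spotify is unreachable, you can still listen to the songs and playlists you have marked.

Because there are so many songs, searching for and downloading them one by one is too tedious. So the system generates a list telling you which albums by which artists are still missing and need to be downloaded. By default it can simply search Youtube for the song and play the result, so you do not even need to upload songs and there is no copyright problem. Like Whatsthesong.

Technical issues involved:

[x] Spotify API (Python)
[ ] Use MusicBrainz to automatically identify local songs and match them to songs in Spotify
[ ] HTTP server for the Python backend
[ ] Youtube song search (accurate matches are hard to get)

Update

Source problem: the Spotify API actually provides a 30-second preview of each song, which is enough for general browsing and lets the site fill up with real content quickly. Complementing that with the local song library makes it very complete. In other words: for songs available locally, play the full track; for songs that are not, offer the 30s preview, prompt the user to download them (indicating the artist and album needed), and provide a link to Youtube search results.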
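
The fallback order described here (local file, then 30s preview, then a download hint) could be expressed as below. `preview_url` is the field Spotify's track object uses for the preview clip and it can be null; the other keys (`uri`, `artist`, `name`) and the `local_files` mapping are assumptions for this sketch.

```python
def pick_playback_source(track, local_files):
    """Prefer a full local file, then the 30s preview, else a search hint."""
    path = local_files.get(track["uri"])
    if path:
        return {"kind": "local", "url": path}
    if track.get("preview_url"):  # may be missing or None for some tracks
        return {"kind": "preview", "url": track["preview_url"]}
    query = f"{track['artist']} {track['name']}"
    return {"kind": "missing", "hint": f"Download needed, try searching: {query}"}
```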

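Recommendation problem: there is no need to build recommendations myself either. The API has Recommendations, plus the general recommendations on the home page. That effectively covers all of Spotify's features.

In other words, it can fully emulate Spotify, then add import and export of personal data on top, plus full control over local music and running on a personal server, which makes for a very appealing online music platform.

Sharing home music to the public internet: the database is stored in a local Webdav at home and connected through frp; a public server hosts a web page that reads the home music library, which unifies the two.

Update

The song source problem for Youtube and others can be solved with this library: Track Connectors

Update

music-story's mapping is incomplete, inaccurate and paid; the amount available on the free tier is very small.

Fortunately, MusicBrainz's database download repository contains a full Spotify-to-MusicBrainz mapping table, which hopefully covers as much content as possible.

Reference project on Github: metabrainz/mbspotify. Data download location: http://ftp.eu.metabrainz.org/pub/musicbrainz/mbspotify

The data format is: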

id mbid spotify_uri cb_user is_deleted
264555  258a2f31-21fe-4bf9-b254-12a09120a034    spotify:album:70yMNdgyIj9SrQXFmdJKx9    49f6627d-ebb2-49cf-86ec-c25529d71e6d
264705  cac0be2c-509f-40a9-b1ac-a70aa2243a54    spotify:album:5hB4jVN4ZHpubyiMmW81K1    67f6f0f4-436f-4c77-8b53-9773f89ed2f5
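
A small parser for rows shaped like the sample above might look as follows. The real dump's delimiter and the encoding of `is_deleted` are not confirmed here; the sketch simply assumes whitespace-separated columns and treats a missing fifth column as `None`, matching the two sample rows.

```python
from typing import NamedTuple, Optional

class Mapping(NamedTuple):
    id: int
    mbid: str
    spotify_uri: str
    cb_user: str
    is_deleted: Optional[str]  # absent in the sample rows

def parse_mapping_line(line):
    """Parse one whitespace-separated row; return None for headers or blanks."""
    parts = line.split()
    if len(parts) < 4 or not parts[0].isdigit():
        return None
    deleted = parts[4] if len(parts) > 4 else None
    return Mapping(int(parts[0]), parts[1], parts[2], parts[3], deleted)

sample = (
    "264555  258a2f31-21fe-4bf9-b254-12a09120a034    "
    "spotify:album:70yMNdgyIj9SrQXFmdJKx9    "
    "49f6627d-ebb2-49cf-86ec-c25529d71e6d"
)
row = parse_mapping_line(sample)
```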

solomonxie commented 5 years ago

DATABASE DESIGN

It could be either a relational database or a NoSQL database.

STRUCTURE

Here is the basic structure of the databases.

User Library -> Postgresql
- User Data -> Can export to Sqlite as a stand-alone database
  - _uUsers: {uid, uname, ...}
  - _uHosts: {id, uid, host_id, {auth}, ...} -> 3rd party Provider's login info(JSON)
  - _uTracks: {tid, uid, name, recentPlayDate, playedCounts, rate, memo}
  - _uAlbums: {abid, uid, title, likedDate, memo}
  - _uArtists: {atid, uid, name, likedDate, memo}
  - _uPlaylist: {plid, uid, title, info, createDate, [tids], memo}
  - _uRecommends: {id, uid, [tids], [abids], [atids], [plids], rcm_date} -> Multiple recommends
- Mappings -> Can be shared to
  - hosts: {host_id, host, info, uri, {auth}, ...}
  - _mpTracks: {tid, spotify_id, musicbrainz_id, itunes_id, [fids], ...} -> Extensible Fields
  - _mpAlbums: {abid, spotify_id, musicbrainz_id, itunes_id, ...} -> Extensible Fields
- Files -> File System: Webdav, Google Drive, Dropbox -> Sqlite
  - _fsTracks: {id, tid, path, {file_infos}, {tags}, ...}
  - _fsPlaylists: {id, path, ...}
Public Libraries -> Incremental, grows as more retrieves happen
- Mapping Public Library -> Postgresql, auto-imported from Local library
  - _mpTracks
  - _mpAlbums
  - _mpArtists
- Spotify Public Library -> Postgresql
  - Tracks: {...}
  - Albums: {...}
  - Artists: {...}
  - Playlists: {...}
- MusicBrainz Public Library -> Postgresql
  - Tracks: {...}
  - Albums: {...}
  - Artists: {...}

EXPLANATION

Independent Libraries explanation:

User library data tables
- User Data tables should store personal data without any detailed music information. That is, they only store "pointers" to "Mappings", e.g. _uTracks.tid -> _mpTracks.tid -> Spotify.tracks.id, MusicBrainz.tracks.id, ...
- Mappings is the ESSENTIAL MIDDLE LAYER that connects User data with the various providers' databases. This layer keeps the libraries independent of each other and highly extensible.
Public libraries are independent libraries from different providers, such as Spotify and MusicBrainz. No personal data should "pollute" public libraries, so that they stay easy to extend and update.
- Mapping Public Library is a stand-alone database that gathers the mapping rules and can be shared with the public. It collects no user data, only mapping rules for tracks, albums and artists between different providers.
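
To make the pointer chain concrete, here is a small sketch using an in-memory SQLite database. The three tables are simplified stand-ins for the user, mapping and Spotify track tables; column sets and sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE u_tracks  (tid TEXT, uid TEXT, memo TEXT);
CREATE TABLE mp_tracks (tid TEXT PRIMARY KEY, spotify_id TEXT, musicbrainz_id TEXT);
CREATE TABLE spotify_tracks (id TEXT PRIMARY KEY, name TEXT);
INSERT INTO u_tracks  VALUES ('t1', 'u1', 'favourite');
INSERT INTO mp_tracks VALUES ('t1', 'sp1', 'mb1');
INSERT INTO spotify_tracks VALUES ('sp1', 'Song A');
""")

def user_tracks(uid):
    """Follow user pointer -> mapping -> provider record."""
    return conn.execute(
        """
        SELECT s.name, m.musicbrainz_id, u.memo
        FROM u_tracks u
        JOIN mp_tracks m ON m.tid = u.tid
        JOIN spotify_tracks s ON s.id = m.spotify_id
        WHERE u.uid = ?
        """,
        (uid,),
    ).fetchall()
```

Because the user table holds only pointers and personal fields, deleting a row from it leaves the mapping and provider tables untouched.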

~u_sources is the User data table to store pointers to specific Public libraries and specific resources, such Spotify's albums, MusicBrainz's tracks. u_sources is the essential middle layer to connect User data with variant providers' database.~ ~In the table u_sources, sType stands for "Source Type" , takes value of "Track, Album, Artist, Playlist"; sHost stands for Source Host, such as Spotify, MusicBrainz, Youtube; xid stands for Uncertain ID, which takes value of id pointing to the related table's id, such as "[Org]Spotify's [Type] Track table's [xid] track_id".~

WORK FLOW

Work Flow: When a new song is fetched, it is processed and persisted in the database, and it is not deleted when a user removes it from their own liked songs. This allows the music database to grow steadily.

ERP Design

Updated ERP Design

solomonxie commented 5 years ago

ORM ARCHITECTURE

Object-relational models: the "classes" in code that correspond to "tables" in the database.

`{Class: Song}`

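class Song:

    def __init__(self, sid):
        self.sid = sid        # song id handed to every resource
        self.gene = {}
        self.resources = []   # per-instance list, not shared between songs

    def addResource(self, resource):
        # `resource` is a Resource subclass; instantiate it for this song
        r = resource(self.sid)
        if r:
            self.resources.append(r)

    def save(self):
        # save data to Database
        for r in self.resources:
            r.save()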

`{Class: Album}`

class Album:

    aid = ''
    arts = ''
    name = ''
    desc = ''

`{Class: Resource}`

Base class:

class Resource:
    def __init__(self, sid):
        pass
    def save(self):
        pass

Child class: SpotifyResource:

class SpotifyResource(Resource):
    def __init__(self, sid):
        pass
    def save(self):
        pass

Child class: YoutubeResource:

class YoutubeResource(Resource):
    def __init__(self, sid):
        pass
    def save(self):
        pass

`{Class: ResourceProcessor}`

solomonxie commented 5 years ago

Design of ERP with URI [UPDATE]

Updates:

Replace ID with URI in the format of: <Group>:<Type>:<ID>
Abandon most Foreign Keys to add flexibility.
Treat everything as a RESOURCE, and introduce a REFERENCE to connect.

The core idea is simple: one real existence can have multiple identities, or be located at multiple providers. Here the Real Existence is a Track / Album / Artist / Playlist, and it may exist under different identities: Spotify's resource, MusicBrainz's info, a local file...
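
A minimal sketch of handling the `<Group>:<Type>:<ID>` format in code. The class and function names are hypothetical; the only thing taken from the notes is the three-part layout.

```python
from typing import NamedTuple

class ResourceURI(NamedTuple):
    group: str  # provider or library, e.g. "spotify"
    type: str   # e.g. "track", "album", "artist", "playlist"
    id: str

    def __str__(self):
        return f"{self.group}:{self.type}:{self.id}"

def parse_uri(text):
    """Split '<Group>:<Type>:<ID>'; the ID part may itself contain colons."""
    parts = text.split(":", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a <Group>:<Type>:<ID> URI: {text!r}")
    return ResourceURI(*parts)
```

Since the group and type travel inside the identifier, a single text column can point at any provider's resource without a per-provider foreign key.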

Mapping

Table Groups

URI

Workflow

solomonxie commented 5 years ago

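The goal of Spoilfy is to build a private cloud system for personal music.

Solution: the foundation is to match and map the Tracks/Albums/Artists of the major mainstream music platforms on the internet and provide one unified ID, then export the user's favorites and history from each platform and link them to that ID. On top of this mapping, the user's music collection history can be managed stably over the long term, and resources for the collection can be searched for and downloaded automatically. Beyond that, it can be developed into a personal music cloud drive accessible anywhere, anytime.

Approach: first, export the favorites history of the personal accounts as JSON documents through each platform's official or open-source WebAPI. Then map them into the database through the ORM built in the backend. Next, take the information of every song, album and artist from platform A and search platform B to obtain the corresponding ID and details. Each resource thus has IDs on at least two platforms. As the project supports more platforms, each resource gains more ID identities. Every resource in the personal collection then becomes "cross-platform", meaning that when one platform fails or becomes inaccessible, other platforms still back it.

Another important feature of Spoilfy is matching locally stored MP3 files to the online music platforms. MP3s are matched by extracting the file metadata and searching the MusicBrainz API. With MP3 matching in place, Spoilfy has the foundation of a personal music cloud system. Because multi-platform mapping is already available, the system can also download MP3 resources for the whole personal collection offline in the background.

Technical implementation:

Database design: Spoilfy's database has been through several refactorings; the main difficulty is mapping resources across platforms. The current scheme: each music platform gets 5 basic tables, Account/Tracks/Albums/Artists/Playlists, plus a separate References intermediate table dedicated to storing each resource's ID mappings across platforms. In addition, to reduce the number of tables and the search complexity, the project uses a three-part "URI" in place of traditional IDs, namely "Platform:Type:ID". Also to reduce complexity, the database drops database-level foreign key constraints entirely and leaves that control to the program.

Core feature development: the project is developed in Python and split into 4 submodules: WebAPI/ORM/Ops/Mappers. The WebAPI module is only responsible for fetching information from each music platform. The ORM module builds ORM models of all tables with SQLAlchemy and, to improve maintainability and reduce complexity, abandons SQLAlchemy's Relationship logic in favor of raw SQL statements for advanced queries. The Ops module handles all music import/export, advanced queries, building associations and similar functions. Mappers handles all cross-platform mapping.

Deployment: the project will use an Nginx+uWSGI+Flask HTTP server architecture to stay lightweight. For easy installation, the program will also be packaged and released as a Docker image.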
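
The References intermediate table queried with raw SQL and no foreign keys could look roughly like this. It is a guess at the shape, not the project's schema: the table name `refs`, its two columns and the `app:track:1` unified URI are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No FOREIGN KEY constraints: integrity is left to application code.
conn.execute("CREATE TABLE refs (real_uri TEXT, provider_uri TEXT UNIQUE)")

def bind(real_uri, provider_uri):
    """Record that a provider resource is one identity of a unified resource."""
    conn.execute(
        "INSERT OR IGNORE INTO refs VALUES (?, ?)", (real_uri, provider_uri)
    )

def identities(provider_uri):
    """All provider URIs that share a unified resource with the given one."""
    rows = conn.execute(
        """
        SELECT b.provider_uri
        FROM refs a
        JOIN refs b ON b.real_uri = a.real_uri
        WHERE a.provider_uri = ?
        ORDER BY b.provider_uri
        """,
        (provider_uri,),
    )
    return [r[0] for r in rows]

bind("app:track:1", "spotify:track:abc")
bind("app:track:1", "musicbrainz:recording:xyz")
```

With this layout, adding a new platform only adds rows, not tables or columns.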

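The goal of Gitissues is to build a personal blog system on top of Github Issues. WHY: Github Issues has everything a personal blog needs, plus one-step image upload with Ctrl-V, a built-in comment section and more. But it lacks real edit-history tracking, backups, custom pages and similar features. Hence the decision to develop Gitissues. SOLUTION: Through the Github API and Github Hooks, automatically fetch the Issues posts and save updates to a local Git repo with automatic Commit/Push. On that basis, the program automatically generates the "Front Matter" for static site generators such as Jekyll/Gitbook/Readthedoc, so that a personal website is generated automatically. DEVELOPMENT: The program is developed in Python. It downloads the user's Issues posts as raw JSON through the Github API and extracts the posts and all their metadata. It automatically parses the title, tags, categories, table of contents and other information from the Markdown posts and generates the front matter, then automatically commits and pushes the updates.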
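
The front matter generation step might be sketched like this. The input keys (`title`, `created_at`, `labels` with a `name` each, `body`) follow the issue objects returned by the Github REST API; the choice of front matter fields and the hand-written YAML are assumptions, and label names with special characters are not escaped in this sketch.

```python
def front_matter(issue):
    """Build a Jekyll-style YAML front matter block from a Github issue dict."""
    tags = [label["name"] for label in issue.get("labels", [])]
    # Escape backslashes and double quotes so the quoted scalar stays valid
    title = issue["title"].replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "---",
        f'title: "{title}"',
        f"date: {issue['created_at']}",
        "tags: [" + ", ".join(tags) + "]",
        "---",
    ]
    # The issue body can be null in the API response
    return "\n".join(lines) + "\n\n" + (issue.get("body") or "")
```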

solomonxie commented 5 years ago

IMDb for music

Unified ID

fletort commented 4 years ago

Is your interesting project in pause ?

solomonxie commented 4 years ago

Is your interesting project in pause ?

Yes, at the moment, because of my workload lately. I would like to continue when I get more free time ;)

joshhansen commented 2 years ago

Love this idea
