xelaj / mtproto

🦋 Full-native go implementation of Telegram API
MIT License

Codegen: need improvements #4

Closed quenbyako closed 3 years ago

quenbyako commented 4 years ago

The current version of the codegen for the Telegram API is awful. We need to make it more stable and easier to read.

Also, we need to somehow implement automatic documentation for types and methods. A JSDoc-style approach looks like a good fit for us.

This branch is the start of work on the codegen tool.

semka95 commented 4 years ago

@ololosha228, do you think it's enough to scrape the type/method/constructor descriptions from the Telegram site and write a comment preceding each type/method/constructor declaration? Or do you want to write the docs manually? Also, I am not sure you need JSDoc; https://pkg.go.dev/github.com/xelaj/mtproto will look good once you add docs.

quenbyako commented 4 years ago

@semka95 I don't think that's possible. But I found this file in tdlib, and I hope we can use it to upgrade our API spec :smiley:

I'm absolutely sure that we need something like JSDoc, though not JSDoc itself. If we could implement JSDoc-like features in the codegen tool, it would be amazing.

semka95 commented 4 years ago

@ololosha228, I am not sure that you can use this .tl file; it describes the tdlib API. I compared it with the telegram_api.tl file, and unfortunately they are completely different. So, if you want to generate docs, you have to scrape the Telegram site (for example, this page contains a table with all methods).

I still do not understand why you need JSDoc when there is built-in godoc. Could you please explain?

quenbyako commented 4 years ago

@semka95 I have an idea:

What if we collect ALL the links from https://core.telegram.org/schema, download each page, and parse the HTML with a script, something like Beautiful Soup in Python? I think the docs are generated somehow too, so the HTML structure should not be hard to parse.

semka95 commented 4 years ago

@ololosha228, yes, good idea, this is the best way to do it, because there is no page that contains a list of types and a list of constructors like the methods page does, which is very odd. So, yeah, parsing all the links is the only way.

There is a cool scraping framework for Go: colly. Let me try to parse it; I will let you know about the results.
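
A minimal sketch of the link-collection step discussed above, using only the standard library rather than colly. The HTML fragment and the assumption that documentation pages live under `/constructor/`, `/method/`, and `/type/` are illustrative, and `extractDocLinks` is a hypothetical helper, not part of this repository.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// docLinkRe matches relative links such as href="/constructor/inputPeerSelf".
// The three path prefixes are an assumption about how the docs site is laid out.
var docLinkRe = regexp.MustCompile(`href="(/(?:constructor|method|type)/[A-Za-z0-9_.]+)"`)

// extractDocLinks returns the unique documentation paths found in page, sorted.
func extractDocLinks(page string) []string {
	seen := map[string]bool{}
	var links []string
	for _, m := range docLinkRe.FindAllStringSubmatch(page, -1) {
		if !seen[m[1]] {
			seen[m[1]] = true
			links = append(links, m[1])
		}
	}
	sort.Strings(links)
	return links
}

func main() {
	// Hypothetical fragment standing in for the downloaded schema page.
	page := `<a href="/constructor/inputPeerSelf">inputPeerSelf</a>
<a href="/type/InputPeer">InputPeer</a>
<a href="/method/messages.getChats">messages.getChats</a>
<a href="/type/InputPeer">InputPeer</a>
<a href="/api/min">min</a>`
	for _, l := range extractDocLinks(page) {
		fmt.Println("https://core.telegram.org" + l)
	}
}
```

Each collected URL would then be downloaded and parsed separately; a real scraper would likely use a proper HTML parser instead of a regular expression.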

quenbyako commented 4 years ago

Alright, so I made a few examples of how to write docs in the TL schema to generate... docs, lol.

This is an example for constructors:

// @type Peer
// @constructor An empty constructor, no user or chat is defined.
inputPeerEmpty#7f3b18ea = InputPeer;
// @constructor Defines the current user.
inputPeerSelf#7da07ec9 = InputPeer;
// @constructor Defines a chat for further interaction.
// @param chat_id Chat identifier
inputPeerChat#179be863 chat_id:int = InputPeer;
// @constructor Defines a user for further interaction.
// @param user_id User identifier
// @param access_hash **access_hash** value from the [user](https://core.telegram.org/constructor/user) constructor
inputPeerUser#7b8e7de6 user_id:int access_hash:long = InputPeer;
// @constructor Defines a channel for further interaction.
// @param channel_id Channel identifier
// @param access_hash **access_hash** value from the [channel](https://core.telegram.org/constructor/channel) constructor
inputPeerChannel#20adaef8 channel_id:int access_hash:long = InputPeer;
// @constructor Defines a [min](https://core.telegram.org/api/min) user that was seen in a certain message of a certain chat.
// @param peer The chat where the user was seen
// @param msg_id The message ID
// @param user_id The identifier of the user that was seen
inputPeerUserFromMessage#17bae2e6 peer:InputPeer msg_id:int user_id:int = InputPeer;
// @constructor Defines a [min](https://core.telegram.org/api/min) channel that was seen in a certain message of a certain chat.
// @param peer The chat where the channel's message was seen
// @param msg_id The message ID
// @param channel_id The identifier of the channel that was seen
inputPeerChannelFromMessage#9c95f7bb peer:InputPeer msg_id:int channel_id:int = InputPeer;

It can be generated into something like this:

Note on the InputPeer description: it is taken verbatim from the official docs, so some descriptions could be misleading. But anyway:

// InputPeer Peer
type InputPeer interface{...}

// InputPeerChat Defines a chat for further interaction.
type InputPeerChat struct {
    // Chat identifier
    ChatID int `validate:"required"`
}

Does it look good? I don't actually know. By the way, now I'm not sure whether we really need a docs parser, because writing SLT is not too hard, it just takes a lot of time)

Also, an example for enums:
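
A minimal sketch of how the `@constructor` and `@param` tags from the example above could be read back out of an annotated schema. `ConstructorDoc`, `ParamDoc`, and `parseAnnotated` are hypothetical names for illustration; the sketch ignores `@type` and does not parse the TL parameters themselves.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// ParamDoc and ConstructorDoc are hypothetical types for this sketch.
type ParamDoc struct{ Name, Description string }

type ConstructorDoc struct {
	Name        string     // constructor name without the #crc suffix
	Description string     // text after @constructor
	Params      []ParamDoc // one entry per @param tag, in source order
}

// parseAnnotated collects the @constructor and @param tags that precede each
// definition line and attaches them to that definition.
func parseAnnotated(schema string) []ConstructorDoc {
	var out []ConstructorDoc
	var cur ConstructorDoc
	sc := bufio.NewScanner(strings.NewReader(schema))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case line == "":
			continue
		case strings.HasPrefix(line, "// @constructor "):
			cur.Description = strings.TrimPrefix(line, "// @constructor ")
		case strings.HasPrefix(line, "// @param "):
			rest := strings.TrimPrefix(line, "// @param ")
			name, desc, _ := strings.Cut(rest, " ")
			cur.Params = append(cur.Params, ParamDoc{name, desc})
		case strings.HasPrefix(line, "//"):
			continue // other tags (e.g. @type) are ignored in this sketch
		default:
			// Definition line, e.g. "inputPeerChat#179be863 chat_id:int = InputPeer;"
			head, _, _ := strings.Cut(line, " ")
			cur.Name, _, _ = strings.Cut(head, "#")
			out = append(out, cur)
			cur = ConstructorDoc{}
		}
	}
	return out
}

func main() {
	schema := `// @type Peer
// @constructor Defines a chat for further interaction.
// @param chat_id Chat identifier
inputPeerChat#179be863 chat_id:int = InputPeer;`
	for _, c := range parseAnnotated(schema) {
		fmt.Printf("// %s %s\n", c.Name, c.Description)
		for _, p := range c.Params {
			fmt.Printf("//   %s: %s\n", p.Name, p.Description)
		}
	}
}
```

The tags are attached to the next definition line, so the order of comments and definitions in the schema matters.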

// @type Privacy key
// @enum Whether we can see the exact last online timestamp of the user
inputPrivacyKeyStatusTimestamp#4f96cb18 = InputPrivacyKey;
// @enum Whether the user can be invited to chats
inputPrivacyKeyChatInvite#bdfb0426 = InputPrivacyKey;
// @enum Whether the user will accept phone calls
inputPrivacyKeyPhoneCall#fabadc5f = InputPrivacyKey;
// @enum Whether the user allows P2P communication during VoIP calls
inputPrivacyKeyPhoneP2P#db9e70d2 = InputPrivacyKey;
// @enum Whether messages forwarded from this user will be [anonymous](https://telegram.org/blog/unsend-privacy-emoji#anonymous-forwarding)
inputPrivacyKeyForwards#a4dd4c08 = InputPrivacyKey;
// @enum Whether people will be able to see the user's profile picture
inputPrivacyKeyProfilePhoto#5719bacc = InputPrivacyKey;
// @enum Whether people will be able to see the user's phone number
inputPrivacyKeyPhoneNumber#352dafa = InputPrivacyKey;
// @enum Whether people can add you to their contact list by your phone number
inputPrivacyKeyAddedByPhone#d1219bdd = InputPrivacyKey;

And what can be generated:

// InputPrivacyKey Privacy key
type InputPrivacyKey uint32

const (
    // Whether we can see the exact last online timestamp of the user
    inputPrivacyKeyStatusTimestamp InputPrivacyKey = 0x4f96cb18
    ...
    // and so on
)
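
A minimal sketch of how such an enum block could be emitted, assuming the `@enum` entries have already been collected. `EnumValue`, `renderEnum`, and the `telegram` package name are hypothetical; the output is passed through `go/format` so that a malformed result is reported as an error.

```go
package main

import (
	"fmt"
	"go/format"
	"strings"
)

// EnumValue is a hypothetical description of one @enum constructor.
type EnumValue struct {
	Name string // Go identifier to emit
	CRC  uint32 // constructor ID from the schema
	Doc  string // text after @enum (assumed to be a single line)
}

// renderEnum emits a uint32-based type and one typed constant per value,
// then runs the result through gofmt to verify that it parses.
func renderEnum(typeName, typeDoc string, values []EnumValue) (string, error) {
	var b strings.Builder
	fmt.Fprintf(&b, "package telegram\n\n// %s %s\ntype %s uint32\n\nconst (\n", typeName, typeDoc, typeName)
	for _, v := range values {
		fmt.Fprintf(&b, "// %s\n%s %s = %#x\n", v.Doc, v.Name, typeName, v.CRC)
	}
	b.WriteString(")\n")
	src, err := format.Source([]byte(b.String()))
	return string(src), err
}

func main() {
	src, err := renderEnum("InputPrivacyKey", "Privacy key", []EnumValue{
		{"inputPrivacyKeyStatusTimestamp", 0x4f96cb18, "Whether we can see the exact last online timestamp of the user"},
		{"inputPrivacyKeyChatInvite", 0xbdfb0426, "Whether the user can be invited to chats"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(src)
}
```

Giving each constant the enum type explicitly keeps the values from being plain untyped integers.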

semka95 commented 4 years ago

@ololosha228, I finally wrote a simple scraper; check out its output: docs.txt (I wrote the result to a JSON file). What do you think: did I scrape enough information, or is something else needed? I think this format is good; it would be easy to add the docs in the generator, for example for methods:

...
f = jen.Comment(methodName + docs[method.Name].Description)
file.Add(f)
...

and, before this, pass the map to the GenerateMethods function:

...
GenerateAndWirteTo(GenerateMethods, s, filepath.Join(outputDir, "methods.go"), docs["method"])
...

What do you think?
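
A minimal sketch of the lookup described above, using only the standard library instead of jen. The JSON layout, the `Doc` type, and the helpers `loadDocs` and `methodComment` are assumptions for illustration, since the real docs.txt format is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Doc is a hypothetical shape for one scraped entry; the real docs.txt
// layout is not shown in this thread.
type Doc struct {
	Description string            `json:"description"`
	Params      map[string]string `json:"params,omitempty"`
}

// loadDocs decodes scraped documentation keyed by TL method name.
func loadDocs(raw []byte) (map[string]Doc, error) {
	docs := map[string]Doc{}
	err := json.Unmarshal(raw, &docs)
	return docs, err
}

// methodComment builds the godoc line for a method, falling back to a
// placeholder when nothing was scraped for it.
func methodComment(goName, tlName string, docs map[string]Doc) string {
	d, ok := docs[tlName]
	if !ok || d.Description == "" {
		return "// " + goName + " has no scraped documentation."
	}
	return "// " + goName + " " + d.Description
}

func main() {
	// Sample input with placeholder text, not real scraper output.
	raw := []byte(`{"messages.getChats": {"description": "Sample text."}}`)
	docs, err := loadDocs(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(methodComment("MessagesGetChats", "messages.getChats", docs))
	fmt.Println(methodComment("AuthLogOut", "auth.logOut", docs))
}
```

Note the explicit space between the Go name and the description, which godoc expects in a leading comment.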

quenbyako commented 4 years ago

@semka95 this is awesome! Everything we need is now in your exported file!

I think that something like JSDoc is better than a separate file, and I have already added nearly 25% of the docs to api_117.tl. Now, with your export, I can do it even faster.

Also, I plan to upgrade the TL parser over the weekend, so I hope the docs will be added to the generated files next week!

Thanks for helping!

semka95 commented 4 years ago

@ololosha228, you're welcome. I can send you the code if you need it.

quenbyako commented 4 years ago

@semka95 yes, it would really help me. You can make a PR or save it in the repository. Or, if it's just a single file, you can post a link here to a snippet on the Go Playground.

semka95 commented 4 years ago

> I think that something like JSDoc is better than a separate file, and I have already added nearly 25% of the docs to api_117.tl. Now, with your export, I can do it even faster.

@ololosha228, OK, I am missing something: you added the docs to the .tl file manually? So the code generator will then parse the docs from this .tl file?

quenbyako commented 4 years ago

@semka95 yep, that's it. Honestly, adding comments automatically will be too hard, and the idea of documenting TL files the way tdlib does looks pretty. Also, 1100 lines of objects is not that much. I'm spending 20-30 seconds per object, which is faster than writing a lexer + parser + serializer + checker for a single file)

semka95 commented 4 years ago

@ololosha228, well, if you take my scraper, you only need to wait about 30 seconds for the scraping to complete and can then use the parsed docs in your code (as in the example I showed you earlier). You don't need to write code to generate the .tl file or write the .tl file manually; you can use the scraped docs in your code generator straight away. That is my concern: why are you doing extra work?

semka95 commented 4 years ago

@ololosha228 maybe you are confused because I sent you a .json file? I sent the .json just to show you the scraper's output. If I embedded the scraper in your library, it would return a map with the parsed docs instead of saving them to a .json file and parsing it again.

quenbyako commented 3 years ago

Fixed in #45

ernado commented 3 years ago

@semka95 I've created a scraper for the Telegram documentation in gotd/getdoc. It is basically what you described, but built on goquery, with pebble for caching.

> adding comments automatically will be too hard

I've also managed to integrate it with code generation; it was not too hard, but it is out of scope.

quenbyako commented 3 years ago

@ernado this is amazing!

Can we modify the current TL schema file to add docs to each struct and method?

ernado commented 3 years ago

It should be possible to embed the parsed documentation into the schema, but IMO it is impractical. I've chosen to embed the documentation into the generated code at the codegen step.