ssc-oscar / lookup

A mirror of bitbucket.org/swcs/lookup

Escaping output of showCnt blob output #13

Closed k----n closed 3 years ago

k----n commented 3 years ago

Is it possible to escape the output of showCnt blob so that it is a single line, as it is when you run the showCnt commit 2 command to look up a commit?

It looks like the commit message printed is generated differently than blob:

versus blob:

audrism commented 3 years ago

It should be a single line, i.e., it contains no \n. So if you read it as binary and split on \n it should work. Some commit messages/author names might contain \r or other special symbols that may make it appear to bleed onto another line.
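
A minimal sketch of the "read as binary and split on \n" advice, in Python. The sample records are made up for illustration and are not real showCnt output; the point is only that splitting bytes on \n alone keeps a record with an embedded \r intact, while a generic line splitter does not.

```python
# Hypothetical data: two newline-terminated records, the first with a stray \r.
data = (
    b"commit-id;author name\rwith carriage return;message\n"
    b"second-id;another author;another message\n"
)

# Split the raw bytes on b"\n" only, so \r stays inside its record.
records = data.split(b"\n")
if records and records[-1] == b"":
    records.pop()  # drop the empty piece after the final newline

# bytes.splitlines() also breaks on \r, so the first record is cut in two.
naive = data.splitlines()

print(len(records))  # 2
print(len(naive))    # 3
```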

audrism commented 3 years ago

Also, if you want to see the raw commit, use option 3, not option 2.

audrism commented 3 years ago

Yes, it would not be difficult to add an option for a blob to be on a single line. How would you like it to be encoded (i.e., \n converted to NEWLINE or something like that)?

k----n commented 3 years ago

Yes, essentially I'd like to avoid having to use Python to get a blob on a single line.

These are the steps needed with Python:

>>> import subprocess
>>> cmd = 'echo 00000070ee74a26858ccd0f4e4d6d95469ffaeac | ~/lookup/showCnt blob'
>>> p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
>>> line = p.stdout.read()
>>> retval = p.wait()
>>> line[:-1]

To produce: 'FROM maven:3-jdk-8\nVOLUME /tmp\n\nCOPY pom.xml .\nRUN mvn dependency:resolve\n\nCOPY src ./src\nRUN mvn clean package\n\nENV JAVA_OPTS=""\nENTRYPOINT [ "sh", "-c", "java $JAVA_OPTS -Djava.security.egd=file:/dev/./urandom -jar /app.jar" ]\n'

audrism commented 3 years ago

Ok, pull on lookup and try option 1

k----n commented 3 years ago

I'm still getting multiline output with this command: echo 00000070ee74a26858ccd0f4e4d6d95469ffaeac | ~/lookup/showCnt blob 1

Output:

FROM maven:3-jdk-8
VOLUME /tmp

COPY pom.xml .
RUN mvn dependency:resolve

COPY src ./src
RUN mvn clean package

ENV JAVA_OPTS=""
ENTRYPOINT [ "sh", "-c", "java $JAVA_OPTS -Djava.security.egd=file:/dev/./urandom -jar /app.jar" ]

audrism commented 3 years ago

Perhaps do git pull once more on ~/lookup?

k----n commented 3 years ago

Works now, perfect! Thanks a lot!

k----n commented 3 years ago

A note for anybody else that might come across this...

To print the output correctly in Python, you will also have to add another regex to escape backslash for output format 1:

$code =~ s/\\/\\\\/g;

and also escape ' or ".

I think it's good to keep in mind the escape characters of the language you intend to use to parse/print the output (e.g. https://www.w3schools.com/python/gloss_python_escape_characters.asp)
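
For illustration, here is the same ordering concern sketched in Python rather than in the Perl used by showCnt. The function names are hypothetical and this is not the tool's own code; it only shows why backslashes are escaped before newlines, so that a literal backslash-n already in the blob survives a round trip.

```python
def escape_one_line(text: str) -> str:
    # Escape backslashes first; otherwise a literal "\n" in the blob would be
    # indistinguishable from an escaped newline afterwards.
    return (
        text.replace("\\", "\\\\")
        .replace("\r", "\\r")
        .replace("\n", "\\n")
    )


def unescape_one_line(line: str) -> str:
    # Walk the string once so that "\\" followed by "n" is not misread as "\n".
    mapping = {"n": "\n", "r": "\r", "\\": "\\"}
    out = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            nxt = line[i + 1]
            out.append(mapping.get(nxt, "\\" + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# A blob with real newlines and a literal backslash-n inside a shell string.
blob = 'ENV JAVA_OPTS=""\nRUN echo "a\\nb"\n'
line = escape_one_line(blob)
print("\n" in line)                     # False
print(unescape_one_line(line) == blob)  # True
```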

k----n commented 3 years ago

I guess another issue to be aware of is that splitting newlines with regex does not always work (I assume because of encoding):

> echo 11ee1cb857207ef723cdb864ea52cae1e52c47ee | showCnt blob 1
11ee1cb857207ef723cdb864ea52cae1e52c47ee;FROM nginx\n\nADD conf/php-services.conf /etc/nginx/conf.d/php-services.conf\n\n# 開啟 80 port 給機器?\n?部使用\nEXPOSE 80\n

vs

> echo 11ee1cb857207ef723cdb864ea52cae1e52c47ee | showCnt blob
FROM nginx

ADD conf/php-services.conf /etc/nginx/conf.d/php-services.conf

# 開啟 80 port 給機器內部使用
EXPOSE 80

Notice how the 內 in 給機器內部使用 comes out as ?\n? in the single-line output.

I'm not sure how this can be remedied.
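
One plausible explanation, which this thread does not confirm: the UTF-8 encoding of 內 contains the byte 0x85, and when bytes are handled one per character that byte is U+0085 (NEL), which many tools treat as a line break. The Python sketch below only demonstrates that byte-level coincidence; it makes no claim about what showCnt does internally.

```python
ch = "內"
raw = ch.encode("utf-8")
print(raw)  # b'\xe5\x85\xa7'

# Assumption for illustration: decode one byte per character (Latin-1) to
# mimic byte-oriented processing. The middle byte becomes U+0085 (NEL),
# which str.splitlines() treats as a line boundary.
as_latin1 = raw.decode("latin-1")
pieces = as_latin1.splitlines()
print(len(pieces))  # 2
```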

audrism commented 3 years ago

The only way to do it is to create a specialized encoding that prints well on a single line

k----n commented 3 years ago

Thanks, looks like base64 works.

> echo 11ee1cb857207ef723cdb864ea52cae1e52c47ee | ~/lookup_latest/showCnt.perl blob | base64 -w 0 | xargs echo
RlJPTSBuZ2lueAoKQUREIGNvbmYvcGhwLXNlcnZpY2VzLmNvbmYgL2V0Yy9uZ2lueC9jb25mLmQvcGhwLXNlcnZpY2VzLmNvbmYKCiMg6ZaL5ZWfIDgwIHBvcnQg57Wm5qmf5Zmo5YWn6YOo5L2/55SoCkVYUE9TRSA4MAoK

> echo 11ee1cb857207ef723cdb864ea52cae1e52c47ee | ~/lookup_latest/showCnt.perl blob | base64 -w 0 | xargs echo | base64 --decode
FROM nginx

ADD conf/php-services.conf /etc/nginx/conf.d/php-services.conf

# 開啟 80 port 給機器內部使用
EXPOSE 80
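
The same round trip can be sketched in Python with the standard base64 module, for anyone consuming the output from a script. The sample blob below is abbreviated from the Dockerfile above; nothing here depends on showCnt itself.

```python
import base64

blob = "FROM nginx\n\n# 開啟 80 port 給機器內部使用\nEXPOSE 80\n".encode("utf-8")

# b64encode does not insert line breaks, so the result is a single line.
encoded = base64.b64encode(blob).decode("ascii")
print("\n" in encoded)  # False

# Decoding restores the original bytes, multi-byte characters included.
decoded = base64.b64decode(encoded)
print(decoded == blob)  # True
```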

audrism commented 3 years ago

Ok, should I change blob 1-liner output to use base64 then?

k----n commented 3 years ago

Yes, please. I think that would be the better option.

Should there also be an option to base64 encode commit messages as well?

audrism commented 3 years ago

The format/debug options are a mess: the one-line blob output uses base64 (~/lookup_latest/showCnt.perl blob 2); for commits, use option 7 (~/lookup_latest/showCnt.perl commit 7).

Note that \n and \r still need to be quoted, as base64 appears to keep them. To decode, replace the quoted sequences with the real characters, then base64-decode.
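
A sketch of that decoding step in Python, under a stated assumption: "quoted" is taken to mean that line breaks inside the base64 text appear as the two-character sequences backslash-n and backslash-r. The helper name is hypothetical, and the input is simulated with base64.encodebytes rather than taken from showCnt.

```python
import base64


def decode_quoted_base64(field: str) -> bytes:
    # Assumption: "\\n" and "\\r" stand for real line breaks inside the base64
    # text. Restore them first; b64decode then ignores the line breaks.
    restored = field.replace("\\n", "\n").replace("\\r", "\r")
    return base64.b64decode(restored)


# Simulated input: encodebytes wraps its output at 76 characters with "\n",
# which we then quote so that everything fits on one line.
original = ("FROM nginx\n" * 10).encode("utf-8")
wrapped = base64.encodebytes(original).decode("ascii")
quoted = wrapped.replace("\n", "\\n")

print("\n" in quoted)                            # False
print(decode_quoted_base64(quoted) == original)  # True
```

Skipping the replacement is not safe: the letters n and r are themselves valid base64 characters, so a lenient decoder would drop the backslash but keep the letter and corrupt the data.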

b103a5008840212382f95d387f58f0f752082570